
Releases: Lexmata/daimon

v0.16.0 — OpenSearch Vector Store Plugin

04 Mar 21:12
a6086f8


OpenSearch Vector Store Plugin

This release adds the second vector store plugin: OpenSearch k-NN.

New Crate: daimon-plugin-opensearch

An OpenSearch k-NN backed VectorStore for search-engine-native RAG workloads.

  • OpenSearchVectorStore — implements VectorStore (upsert, query, delete, count) using the official opensearch-rs client with k-NN queries
  • Space types — Cosine Similarity (cosinesimil), L2 (l2), Inner Product (innerproduct)
  • k-NN engines — Lucene (default), NMSLIB, FAISS — configurable per-index
  • HNSW indexing — configurable m and ef_construction parameters
  • Auto-index-creation — creates the k-NN index with correct mappings on first use (opt-out via builder)
  • Manual setup — the index_settings module exports create_index_body() JSON
  • Custom transport — build_with_client() accepts a pre-configured OpenSearch client for AWS SigV4, custom TLS, etc.
  • AWS auth — optional aws-auth feature enables Amazon OpenSearch Service SigV4 authentication

Quick Start

[dependencies]
daimon = { version = "0.16", features = ["opensearch", "openai"] }

use daimon::prelude::*;

let store = OpenSearchVectorStoreBuilder::new("http://localhost:9200", 1536)
    .index("my_docs")
    .space_type(OpenSearchSpaceType::CosineSimilarity)
    .engine(OpenSearchEngine::Lucene)
    .build()
    .await?;

let kb = SimpleKnowledgeBase::new(embedding_model, store);
kb.ingest(vec![Document::new("relevant text")]).await?;
let results = kb.search("query", 5).await?;

Full Changelog: v0.15.0...v0.16.0

v0.15.0 — Vector Store Plugins & pgvector

04 Mar 20:54
b39a4b8


Vector Store Plugins & pgvector

This release makes the VectorStore trait available to external plugin crates and ships the first vector store plugin: pgvector.

New Crate: daimon-plugin-pgvector

A pgvector-backed VectorStore for PostgreSQL RAG workloads.

  • PgVectorStore — implements VectorStore (upsert, query, delete, count) using tokio-postgres with deadpool-postgres connection pooling
  • Distance metrics — Cosine (<=>), L2 (<->), Inner Product (<#>) with matching HNSW operator classes
  • HNSW indexing — configurable m and ef_construction parameters
  • Auto-migration — creates the vector extension, table, and HNSW index on first use (opt-out via builder)
  • Manual SQL — the migrations module exports CREATE_EXTENSION, create_table_sql(), create_hnsw_index_sql() for DIY schema management
  • Builder pattern — PgVectorStoreBuilder::new(conn_str, dimensions) with .table(), .distance_metric(), .pool_size(), .hnsw_m(), .hnsw_ef_construction(), .auto_migrate()
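To make the three operators concrete, here is a stdlib-only sketch of the distance functions pgvector's operators compute (these are illustrative reimplementations, not daimon's or pgvector's actual code):

```rust
/// Cosine distance, as computed by pgvector's `<=>` operator:
/// 1 - (a . b) / (|a| * |b|). Identical directions give 0.
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    1.0 - dot / (na * nb)
}

/// L2 (Euclidean) distance, pgvector's `<->` operator.
fn l2_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Negative inner product, pgvector's `<#>` operator (negated so that
/// smaller values mean "closer", like the other two metrics).
fn neg_inner_product(a: &[f64], b: &[f64]) -> f64 {
    -a.iter().zip(b).map(|(x, y)| x * y).sum::<f64>()
}
```

Each metric pairs with a matching HNSW operator class so index scans order results the same way the operator does.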

Architecture: VectorStore in daimon-core

Document, ScoredDocument, VectorStore, ErasedVectorStore, and SharedVectorStore have been moved from the main daimon crate to daimon-core. This follows the same pattern as the TaskBroker refactor in v0.13.0, enabling plugin crates to implement VectorStore by depending only on the lightweight daimon-core. The main daimon crate re-exports everything — no breaking changes for existing users.
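To illustrate the plugin pattern this enables, here is a simplified, synchronous sketch of a third-party VectorStore implementation. The real daimon-core trait is async and returns daimon's error type; the signatures below are illustrative assumptions, not the actual API:

```rust
// Simplified sketch: a plugin crate implements the core trait against
// any backend while depending only on the lightweight core crate.
#[derive(Clone)]
struct Document {
    id: String,
    text: String,
    embedding: Vec<f32>,
}

struct ScoredDocument {
    document: Document,
    score: f32,
}

trait VectorStore {
    fn upsert(&mut self, docs: Vec<Document>);
    fn query(&self, embedding: &[f32], top_k: usize) -> Vec<ScoredDocument>;
    fn delete(&mut self, id: &str);
    fn count(&self) -> usize;
}

struct InMemoryStore {
    docs: Vec<Document>,
}

impl VectorStore for InMemoryStore {
    fn upsert(&mut self, docs: Vec<Document>) {
        for doc in docs {
            self.docs.retain(|d| d.id != doc.id); // replace on id collision
            self.docs.push(doc);
        }
    }

    fn query(&self, embedding: &[f32], top_k: usize) -> Vec<ScoredDocument> {
        // Score by inner product and return the top_k matches.
        let mut scored: Vec<ScoredDocument> = self
            .docs
            .iter()
            .map(|d| ScoredDocument {
                score: d.embedding.iter().zip(embedding).map(|(x, y)| x * y).sum(),
                document: d.clone(),
            })
            .collect();
        scored.sort_by(|a, b| b.score.total_cmp(&a.score));
        scored.truncate(top_k);
        scored
    }

    fn delete(&mut self, id: &str) {
        self.docs.retain(|d| d.id != id);
    }

    fn count(&self) -> usize {
        self.docs.len()
    }
}
```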

Quick Start

[dependencies]
daimon = { version = "0.15", features = ["pgvector", "openai"] }

use daimon::prelude::*;

let store = PgVectorStoreBuilder::new("postgresql://user:pass@localhost/db", 1536)
    .table("my_docs")
    .distance_metric(DistanceMetric::Cosine)
    .build()
    .await?;

let kb = SimpleKnowledgeBase::new(embedding_model, store);
kb.ingest(vec![Document::new("relevant text")]).await?;
let results = kb.search("query", 5).await?;

Full Changelog: v0.14.0...v0.15.0

v0.14.0 — Performance & Benchmarking

04 Mar 20:34
3d170a9


Performance & Benchmarking

This release focuses on profiling hot paths across the framework and delivering measurable performance improvements, along with expanded benchmark coverage for recently added components.

Performance Improvements

Area                          Metric                   Improvement
ToolRegistry spec generation  uncached 50-tool lookup  -33% (10.4 µs → 6.9 µs)
Chain transforms              3-stage pipeline         -30% (287 ns → 210 ns)
HotSwapAgent prompt           simple prompt            -18% (2.5 µs → 1.8 µs)
HotSwapAgent swap             model swap               -26% (140 ns → 112 ns)
DAG fan-out                   3-way fan-out + merge    -11% (10.9 µs → 10.5 µs)

What Changed

  • ToolRegistry: generation-based cache invalidation — tool_specs() uses a generation counter to detect stale caches, avoiding redundant recomputation.
  • Memory: contiguous slice clone — SlidingWindowMemory and TokenWindowMemory now use make_contiguous().to_vec() instead of iter().cloned().collect(), producing a single memcpy.
  • SlidingWindowMemory: single-pop eviction — replaced while loop with single if check since only one message is added at a time.
  • ReAct loop: reduced cloning — tool calls moved with std::mem::take instead of .to_vec(); middleware short-circuit paths move messages instead of cloning.
  • MiddlewareStack: early return when empty — all three middleware pipeline methods return Continue immediately when no middleware is registered.
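The generation-counter technique behind the ToolRegistry change can be sketched as follows (a stdlib-only illustration with stand-in types; names and internals are assumptions, not daimon's actual code):

```rust
use std::cell::RefCell;

// Sketch of generation-based cache invalidation: mutations bump a
// counter, and the cached result records which generation produced it.
struct ToolRegistry {
    tools: Vec<String>,                         // stand-in for tool definitions
    generation: u64,                            // bumped on every mutation
    cache: RefCell<Option<(u64, Vec<String>)>>, // (generation, cached specs)
}

impl ToolRegistry {
    fn new() -> Self {
        Self { tools: Vec::new(), generation: 0, cache: RefCell::new(None) }
    }

    fn add_tool(&mut self, name: &str) {
        self.tools.push(name.to_string());
        self.generation += 1; // any mutation invalidates the cache
    }

    /// Recomputes specs only when the cached generation is stale.
    fn tool_specs(&self) -> Vec<String> {
        if let Some((g, specs)) = self.cache.borrow().as_ref() {
            if *g == self.generation {
                return specs.clone(); // cache hit: no recomputation
            }
        }
        let specs: Vec<String> =
            self.tools.iter().map(|t| format!("spec({t})")).collect();
        *self.cache.borrow_mut() = Some((self.generation, specs.clone()));
        specs
    }
}
```

Compared with clearing the cache on every write, recording the generation lets reads stay lock-light and makes staleness detection a single integer compare.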

New Benchmarks

  • HotSwapAgent (prompt, swap_model)
  • InProcessBroker (submit/receive/complete roundtrip)
  • InProcessEventBus (publish/receive)
  • InMemoryCheckpoint (save/load)
  • SerializableStreamEvent (serialize/deserialize)

Full Changelog: v0.13.0...v0.14.0

v0.13.0 — Cloud-Native Task Brokers

04 Mar 19:55
5e12ced


Cloud-Native Task Brokers

This release adds native cloud message broker implementations for distributed agent task execution, letting you use the same TaskBroker trait with your cloud provider's managed messaging service.

New Broker Implementations

Provider       Broker            Feature Flag  Backend
AWS Bedrock    SqsBroker         sqs           AWS SQS via aws-sdk-sqs
Google Gemini  PubSubBroker      pubsub        Cloud Pub/Sub REST API
Azure OpenAI   ServiceBusBroker  servicebus    Service Bus REST API

Architecture Change

The TaskBroker trait and core distributed types (AgentTask, TaskResult, TaskStatus, ErasedTaskBroker) have moved from the main daimon crate to daimon-core::distributed. This enables provider crates to implement cloud-native brokers without circular dependencies. Existing code is unaffected — the main crate re-exports everything at the same paths.
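The contract all brokers share can be sketched with a simplified, synchronous in-process implementation (the real trait is async; these signatures are illustrative assumptions based on the documented method names submit, status, receive, complete, fail):

```rust
use std::collections::{HashMap, VecDeque};

// Simplified sketch of the TaskBroker contract. A cloud broker (SQS,
// Pub/Sub, Service Bus) implements the same surface over its queue API.
#[derive(Clone, Debug, PartialEq)]
enum TaskStatus { Queued, Running, Completed, Failed }

struct AgentTask { id: String, prompt: String }

struct InProcessBroker {
    queue: VecDeque<AgentTask>,
    status: HashMap<String, TaskStatus>,
}

impl InProcessBroker {
    fn new() -> Self {
        Self { queue: VecDeque::new(), status: HashMap::new() }
    }

    fn submit(&mut self, task: AgentTask) {
        self.status.insert(task.id.clone(), TaskStatus::Queued);
        self.queue.push_back(task);
    }

    /// Workers poll for the next task; receiving marks it Running.
    fn receive(&mut self) -> Option<AgentTask> {
        let task = self.queue.pop_front()?;
        self.status.insert(task.id.clone(), TaskStatus::Running);
        Some(task)
    }

    fn complete(&mut self, id: &str) {
        self.status.insert(id.to_string(), TaskStatus::Completed);
    }

    fn fail(&mut self, id: &str) {
        self.status.insert(id.to_string(), TaskStatus::Failed);
    }

    fn status(&self, id: &str) -> Option<&TaskStatus> {
        self.status.get(id)
    }
}
```

Because workers only see the trait, swapping SQS for Pub/Sub is a feature-flag and construction change, not a code change.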

Quick Start

# Use AWS SQS for distributed tasks
daimon = { version = "0.13", features = ["bedrock", "sqs"] }

# Use Google Cloud Pub/Sub
daimon = { version = "0.13", features = ["gemini", "pubsub"] }

# Use Azure Service Bus
daimon = { version = "0.13", features = ["azure", "servicebus"] }

# Everything
daimon = { version = "0.13", features = ["full"] }

All Changes

See the full CHANGELOG for details.

v0.12.0 — Runtime & Persistence

04 Mar 19:36
08e6b10


What's New

NATS KV Checkpoint Backend (feature = "nats")

NatsKvCheckpoint stores checkpoints in NATS JetStream key-value buckets — distributed, replicated checkpoint storage with no external database required.

let cp = NatsKvCheckpoint::connect("nats://127.0.0.1:4222", "daimon-checkpoints").await?;
cp.save(&state).await?;
let loaded = cp.load("run-1").await?;

Redis Checkpoint Backend (feature = "redis")

RedisCheckpoint stores checkpoints in Redis hashes — fast, shared checkpoint storage accessible from multiple processes.

let cp = RedisCheckpoint::new("redis://127.0.0.1/", "daimon:checkpoints").await?;
cp.save(&state).await?;

Agent Hot-Reload

HotSwapAgent wraps an Agent behind a RwLock, enabling runtime reconfiguration without restart:

let hot = HotSwapAgent::new(agent);

// Use normally
let response = hot.prompt("Hello").await?;

// Swap model at runtime — next prompt uses the new model
hot.swap_model(new_model).await;
hot.swap_system_prompt(Some("New persona".into())).await;
hot.add_tool(my_tool).await;
hot.remove_tool("old_tool").await;

Supports swapping: model, system prompt, memory, hooks, middleware, guardrails, tool retry policy, temperature, max tokens, and max iterations. Clone-friendly — all clones share the same underlying agent.
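The clone-friendly behavior is the standard shared-handle pattern: clones hold the same Arc'd lock, so a swap through any handle is seen by all of them. A minimal stdlib sketch (the HotSwap type here is illustrative; the real HotSwapAgent wraps a full Agent, not a String):

```rust
use std::sync::{Arc, RwLock};

// Minimal sketch of the shared-handle pattern behind hot-reload.
#[derive(Clone)]
struct HotSwap {
    inner: Arc<RwLock<String>>, // stand-in for the agent's configuration
}

impl HotSwap {
    fn new(value: &str) -> Self {
        Self { inner: Arc::new(RwLock::new(value.to_string())) }
    }

    /// Readers take a shared lock, so many prompts can run concurrently.
    fn current(&self) -> String {
        self.inner.read().unwrap().clone()
    }

    /// Swapping takes the write lock; the next read sees the new value.
    fn swap(&self, value: &str) {
        *self.inner.write().unwrap() = value.to_string();
    }
}
```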

Streaming Distributed Execution

StreamingTaskWorker uses Agent::prompt_stream() and publishes each StreamEvent through a TaskEventBus, enabling real-time observation of agent progress across process boundaries:

let bus = InProcessEventBus::new(64);
let worker = StreamingTaskWorker::new(broker, bus.clone(), agent_factory);

let mut rx = bus.subscribe();
tokio::spawn(async move { worker.run().await });

while let Ok(evt) = rx.recv().await {
    println!("{}: {:?}", evt.task_id, evt.event);
}

Serializable TaskStreamEvent / SerializableStreamEvent types support JSON round-tripping for transport over Redis pub/sub, NATS, WebSocket, or any custom bus.


Full Changelog: v0.11.0...v0.12.0

v0.11.0

04 Mar 18:43
5513423


Daimon v0.11.0

Massive feature release covering all development since v0.2.0.

Distributed Execution

  • TaskBroker trait with submit, status, receive, complete, fail methods
  • Redis broker (feature = "redis"), NATS JetStream broker (feature = "nats"), RabbitMQ broker (feature = "amqp")
  • gRPC broker (feature = "grpc") with GrpcBrokerServer + GrpcBrokerClient
  • InProcessBroker for local parallelism and testing
  • TaskWorker with single, continuous, and parallel execution modes
  • Distributed checkpoint sync with CheckpointSync and CheckpointReplicator

Agent Patterns

  • Agent-as-Tool, Supervisor, Handoff with max-handoff limits
  • Agent cloning/forking: fork(), fork_from_checkpoint(), fork_with_memory()
  • ForkBuilder for builder-style mutation of forked agent config

MCP Ecosystem

  • MCP Client with stdio and HTTP transports
  • MCP Server for exposing tool registries via JSON-RPC
  • WebSocket transport + server for persistent MCP connections
  • gRPC MCP transport + server (feature = "grpc" + feature = "mcp")

Orchestration

  • Chain, Graph, DAG, Workflow (Eino-style DAG with field mapping)

RAG Pipeline

  • VectorStore + KnowledgeBase plugin traits
  • InMemoryVectorStore, Qdrant retriever (feature = "qdrant")
  • Embeddings API with OpenAI, Ollama, Gemini, Azure, Bedrock providers

Safety & Quality

  • Middleware pipeline, input/output guardrails, ContentPolicyGuardrail
  • Self-healing tool retry with fixed and exponential backoff
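The two backoff policies can be sketched in a few lines (an illustration of the technique, not daimon's actual retry API; names and signatures are assumptions):

```rust
use std::time::Duration;

// Sketch of fixed vs. exponential backoff for tool-call retries.
enum Backoff {
    Fixed(Duration),
    Exponential { base: Duration, factor: u32 },
}

impl Backoff {
    /// Delay before retry attempt `n` (0-based).
    fn delay(&self, n: u32) -> Duration {
        match self {
            Backoff::Fixed(d) => *d,
            Backoff::Exponential { base, factor } => *base * factor.pow(n),
        }
    }
}

/// Retry a fallible tool call up to `max_retries` additional times.
fn retry<T, E>(
    mut call: impl FnMut() -> Result<T, E>,
    policy: &Backoff,
    max_retries: u32,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match call() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_retries => return Err(e),
            Err(_) => {
                let _wait = policy.delay(attempt); // a real impl sleeps here
                attempt += 1;
            }
        }
    }
}
```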

Observability & Cost

  • OpenTelemetry export (feature = "otel")
  • Cost tracking with budget limits, streaming cost events

Developer Experience

  • Prompt templates, FewShotTemplate, DynamicContext
  • Structured output, evaluation harness with SemanticSimilarity and LlmJudge scorers
  • Time-travel debugging with inspect_run(), list_runs(), Agent::replay()

Deployment

  • AgentServer (axum) with API key auth
  • A2A protocol (v0.2) client and server

Full Changelog: https://github.com/Lexmata/daimon/blob/v0.11.0/CHANGELOG.md

v0.2.0

03 Mar 20:18


Added

  • Provider prompt caching across all backends:
    • Anthropic: with_prompt_caching() now correctly injects cache_control blocks on the system message and the last tool definition, enabling actual cache hits. Parses cache_creation_input_tokens and cache_read_input_tokens from usage with tracing.
    • OpenAI: Parses prompt_tokens_details.cached_tokens from the usage response (automatic prompt caching).
    • Azure OpenAI: Same as OpenAI — parses prompt_tokens_details.cached_tokens.
    • Google Gemini: New with_cached_content(name) builder method to reference a cachedContents/<id> resource. Parses cachedContentTokenCount from usage metadata.
    • AWS Bedrock: New with_prompt_caching() builder inserts CachePoint blocks after system messages and tool definitions. Parses cache_read_input_tokens from the Converse API response.
    • Ollama: New with_keep_alive(duration) builder to control KV cache retention (e.g. "5m", "1h", "0").
  • Usage::cached_tokens field — number of input tokens served from the provider's cache (subset of input_tokens, defaults to 0).
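The documented invariant — cached_tokens is a subset of input_tokens and defaults to 0 — can be made concrete with a small sketch. The struct mirrors the documented fields; the cache_hit_ratio helper is a hypothetical addition for illustration, not part of daimon's API:

```rust
// Sketch of the Usage invariant described above.
#[derive(Default)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
    cached_tokens: u64, // input tokens served from the provider's cache
}

impl Usage {
    /// Hypothetical helper: fraction of input tokens that hit the cache.
    fn cache_hit_ratio(&self) -> f64 {
        if self.input_tokens == 0 {
            0.0
        } else {
            self.cached_tokens as f64 / self.input_tokens as f64
        }
    }
}
```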

Fixed

  • Anthropic caching was non-functional: the with_prompt_caching() flag sent the beta header but never added cache_control content blocks, so no caching actually occurred. Now correctly marks the system prompt and last tool definition as cache breakpoints.

v0.1.0 — Daimon initial release

03 Mar 19:29


Daimon v0.1.0

Initial release of the Daimon agent framework — a Rust-native AI agent framework for building LLM-powered agents with tool use, memory, and streaming.

Highlights

  • Core ReAct agent loop with streaming, parallel tool execution, cancellation, and usage tracking
  • Five model providers (all feature-gated):
    • OpenAI (openai, default) — Chat Completions API, SSE streaming, response_format, parallel_tool_calls
    • Anthropic (anthropic, default) — Messages API, streaming, prompt caching
    • Google Gemini (gemini) — Generative Language REST API, Vertex AI support
    • Azure OpenAI (azure) — Azure deployments, API key + Entra ID auth
    • AWS Bedrock (bedrock) — Converse/ConverseStream API, guardrails
  • Tool system with ToolRegistry, parallel execution via JoinSet, typed outputs
  • Memory with SlidingWindowMemory and pluggable Memory trait
  • Lifecycle hooks via AgentHook for observability and control
  • Streaming with granular StreamEvent types
  • Observability via tracing::instrument on all agent and provider methods

Getting Started

[dependencies]
daimon = "0.1.0"

use daimon::prelude::*;

#[tokio::main]
async fn main() -> daimon::Result<()> {
    let agent = Agent::builder()
        .model(daimon::model::openai::OpenAi::new("gpt-4o"))
        .system_prompt("You are a helpful assistant.")
        .build()?;

    let response = agent.prompt("What is Rust?").await?;
    println!("{}", response.text());
    Ok(())
}

See the README and CHANGELOG for full details.