# ds-api
A Rust SDK for building LLM agents on top of DeepSeek (and any OpenAI-compatible API). Define tools in plain Rust, plug them into an agent, and consume a stream of events as the model thinks, calls tools, and responds.
## Quickstart
Set your API key and add the dependencies (the quickstart below also uses `serde_json`):

```sh
export DEEPSEEK_API_KEY="sk-..."
```

```toml
# Cargo.toml
[dependencies]
ds-api = "0.10.3"
futures = "0.3"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```
```rust
use ds_api::{AgentEvent, DeepseekAgent, tool};
use futures::StreamExt;
use serde_json::{Value, json};

struct Search;

#[tool]
impl ds_api::Tool for Search {
    /// Search the web and return results.
    /// query: the search query
    async fn search(&self, query: String) -> Value {
        json!({ "results": format!("results for: {query}") })
    }
}

#[tokio::main]
async fn main() {
    let token = std::env::var("DEEPSEEK_API_KEY").unwrap();
    let mut stream = DeepseekAgent::new(token)
        .add_tool(Search)
        .chat("What's the latest news about Rust?");

    while let Some(event) = stream.next().await {
        match event.unwrap() {
            AgentEvent::Token(text) => print!("{text}"),
            AgentEvent::ToolCall(c) => println!("\n[calling {}]", c.name),
            AgentEvent::ToolResult(r) => println!("[result] {}", r.result),
            AgentEvent::ReasoningToken(t) => print!("{t}"),
        }
    }
}
```
The agent runs the full loop for you: it calls the model, dispatches any tool calls, feeds the results back, and keeps going until the model stops requesting tools.
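That control flow can be sketched in plain Rust. The mock `call_model` and `run_tool` functions below are stand-ins invented for illustration; the real loop lives inside the crate and talks to the API:

```rust
// Illustrative sketch of the agent loop, not the crate's actual code.
// Mock model: requests one tool call, then answers once it sees a tool result.
fn call_model(history: &[String]) -> (String, Option<String>) {
    if history.iter().any(|m| m.starts_with("tool:")) {
        ("final answer".to_string(), None)          // no tool requested -> done
    } else {
        (String::new(), Some("search".to_string())) // model requests the "search" tool
    }
}

// Mock tool dispatch.
fn run_tool(name: &str) -> String {
    format!("results from {name}")
}

// Call the model, dispatch any requested tool, feed the result back,
// and stop once the model answers without requesting a tool.
fn agent_loop() -> String {
    let mut history = vec!["user: What's new?".to_string()];
    loop {
        let (text, tool_call) = call_model(&history);
        match tool_call {
            Some(name) => history.push(format!("tool: {}", run_tool(&name))),
            None => return text, // model stopped requesting tools
        }
    }
}

fn main() {
    println!("{}", agent_loop());
}
```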
## Defining tools
Annotate an `impl Tool for YourStruct` block with `#[tool]`. Each method becomes a callable tool:

- Doc comment on the impl block → tool description
- `/// param: description` lines in each method's doc comment → argument descriptions
- Return type just needs to be `serde::Serialize` — the macro handles the JSON schema
```rust
use ds_api::tool;
use serde_json::{Value, json};

struct Calculator;

#[tool]
impl ds_api::Tool for Calculator {
    /// Add two numbers together.
    /// a: first number
    /// b: second number
    async fn add(&self, a: f64, b: f64) -> Value {
        json!({ "result": a + b })
    }

    /// Multiply two numbers.
    /// a: first number
    /// b: second number
    async fn multiply(&self, a: f64, b: f64) -> Value {
        json!({ "result": a * b })
    }
}
```
One struct can have multiple methods — they register as separate tools. Stack as many tools as you need with `.add_tool(...)`.
## Streaming
Call `.with_streaming()` to get token-by-token output instead of waiting for the full response:
```rust
use std::io::{self, Write};

let mut stream = DeepseekAgent::new(token)
    .with_streaming()
    .add_tool(Search)
    .chat("Search for something and summarise it");

while let Some(event) = stream.next().await {
    match event.unwrap() {
        AgentEvent::Token(t) => { print!("{t}"); io::stdout().flush().ok(); }
        AgentEvent::ToolCall(c) => {
            // In streaming mode, ToolCall fires once per SSE chunk.
            // First chunk: c.delta is empty, c.name is set — good moment to show "calling X".
            // Subsequent chunks: c.delta contains incremental argument JSON.
            // In non-streaming mode, exactly one ToolCall fires with the full args in c.delta.
            if c.delta.is_empty() { println!("\n[calling {}]", c.name); }
        }
        AgentEvent::ToolResult(r) => println!("[done] {}: {}", r.name, r.result),
        _ => {}
    }
}
```
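In streaming mode a consumer typically buffers the argument deltas per call id until the JSON is complete. A self-contained sketch of that accumulation; the `ToolCallChunk` struct here is a local mock mirroring the fields named above (`id`, `name`, `delta`), not the crate's type:

```rust
use std::collections::HashMap;

// Local stand-in for the chunk shape described above.
struct ToolCallChunk {
    id: String,
    name: String,
    delta: String,
}

// Concatenate incremental argument JSON per tool-call id, in arrival order.
fn accumulate(chunks: &[ToolCallChunk]) -> HashMap<String, (String, String)> {
    let mut calls: HashMap<String, (String, String)> = HashMap::new();
    for c in chunks {
        let entry = calls
            .entry(c.id.clone())
            .or_insert_with(|| (c.name.clone(), String::new()));
        entry.1.push_str(&c.delta);
    }
    calls
}

fn main() {
    let chunks = vec![
        ToolCallChunk { id: "1".into(), name: "search".into(), delta: "".into() },
        ToolCallChunk { id: "1".into(), name: "search".into(), delta: "{\"query\":".into() },
        ToolCallChunk { id: "1".into(), name: "search".into(), delta: "\"rust\"}".into() },
    ];
    let calls = accumulate(&chunks);
    let (name, args) = &calls["1"];
    println!("{name}: {args}");
}
```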
## `AgentEvent` reference
| Variant | When | Notes |
|---|---|---|
| `Token(String)` | Model is speaking | Streaming: one fragment per chunk. Non-streaming: whole reply at once. |
| `ReasoningToken(String)` | Model is thinking | Only from reasoning models (e.g. `deepseek-reasoner`). |
| `ToolCall(ToolCallChunk)` | Tool call in progress | `chunk.id`, `chunk.name`, `chunk.delta`. Streaming: multiple per call. Non-streaming: one per call. |
| `ToolResult(ToolCallResult)` | Tool finished | `result.name`, `result.args`, `result.result`. |
## Using a different model or provider
Any OpenAI-compatible endpoint works:
```rust
// OpenRouter
let agent = DeepseekAgent::custom(
    "sk-or-...",
    "https://openrouter.ai/api/v1",
    "meta-llama/llama-3.3-70b-instruct:free",
);

// deepseek-reasoner (think before responding)
let agent = DeepseekAgent::new(token)
    .with_model("deepseek-reasoner");
```
## Custom top-level request fields (`extra_body`)
The library exposes an `extra_body` mechanism to let you merge arbitrary top-level JSON fields into the HTTP request body sent to the provider. This is useful for passing provider-specific or experimental options that are not (yet) modelled by the typed request structure.

There are two primary places you can attach `extra_body` fields:

- On an `ApiRequest` (fine-grained, request-local)
- On a `DeepseekAgent` (convenient builder-style; merged into the next requests built from the agent)
### Important notes
- Fields in `extra_body` are flattened into the top-level JSON via `serde(flatten)`, so they appear as peers to `messages`, `model`, etc.
- Avoid key collisions with existing top-level names (e.g. `messages`, `model`); the intended usage is adding provider-specific keys.
- Agent-held `extra_body` maps are merged into the `ApiRequest` when the request is built. (If you want per-request control, prefer `ApiRequest::extra_body`.)
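The merge behaves like a plain map union. The sketch below illustrates the idea with `std::collections::BTreeMap` standing in for `serde_json::Map`; the collision behaviour shown (later inserts overwrite earlier ones) is an assumption of this sketch, not documented behaviour, since the library simply asks you to avoid collisions:

```rust
use std::collections::BTreeMap;

// Stand-ins for the typed request body and the two extra_body maps.
// (String values here instead of serde_json's Value.)
fn build_body(
    typed_fields: BTreeMap<String, String>,
    agent_extra: BTreeMap<String, String>,
    request_extra: BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut body = typed_fields;
    // Agent-held extra_body is merged in when the request is built...
    body.extend(agent_extra);
    // ...then the request-local extra_body (assumption: applied last).
    body.extend(request_extra);
    body
}

fn main() {
    let mut typed = BTreeMap::new();
    typed.insert("model".to_string(), "deepseek-chat".to_string());

    let mut agent_extra = BTreeMap::new();
    agent_extra.insert("provider_option".to_string(), "value".to_string());

    let body = build_body(typed, agent_extra, BTreeMap::new());
    // "provider_option" now sits alongside "model" at the top level.
    println!("{body:?}");
}
```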
### Examples
Using `ApiRequest`:

```rust
use serde_json::{Map, json};
use ds_api::ApiRequest;

let mut m = Map::new();
m.insert("my_flag".to_string(), json!(true));

let req = ApiRequest::builder()
    .messages(vec![])
    .extra_body(m);
// send via ApiClient, or use within library internals that accept ApiRequest
```
Using `DeepseekAgent` builder helpers:

```rust
use serde_json::{Map, json};
use ds_api::DeepseekAgent;

let mut m = Map::new();
m.insert("provider_option".to_string(), json!("value"));

let agent = DeepseekAgent::new(token)
    .extra_body(m) // merge these fields into subsequent requests
    .chat("Hello world");
```
Single-field helper:

```rust
let agent = DeepseekAgent::new(token)
    .extra_field("provider_option", serde_json::json!("value"));
```
## Injecting messages mid-run
You can send a message into a running agent loop — useful when the user types something while the agent is still executing tools.
The interrupt channel is attached with `.with_interrupt_channel()`, which returns the agent plus a sender you can use from any task. The sender type (`InterruptSender`) is a re-export of `tokio::sync::mpsc::UnboundedSender<String>`, so it is cheap to clone and use concurrently:

```rust
let (agent, tx) = DeepseekAgent::new(token)
    .with_streaming()
    .add_tool(SlowTool)
    .with_interrupt_channel();
```
### Behavior and semantics
- Sending an interrupt: call `tx.send("...".into()).unwrap()` from any task or callback. The message will be delivered into the agent's conversation history.
- During tool execution: the agent actively listens for interrupts while a tool is running. If an interrupt message arrives while a tool is executing, the executor will:
  - Immediately append the interrupt text to the conversation history as a `Role::User` message (and drain any queued interrupt messages in order).
  - Abort the currently running tool (the tool future is cancelled) and stop executing further tools for the current round. (Cancellation can only take effect while the tool is awaiting.)
  - Record a placeholder result for the aborted tool (the runtime exposes this as an error-shaped JSON result), then proceed to the next API turn so the model sees the injected user message.
- Between turns / idle transition: any queued interrupts are drained before the next API call, so injected messages are always visible to the model on the next turn.
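The drain-in-order behaviour can be illustrated with a plain `std::sync::mpsc` channel (the crate's `InterruptSender` wraps tokio's unbounded channel, but the ordering semantics sketched here are the same: once one interrupt is observed, everything queued behind it is appended to history in send order):

```rust
use std::sync::mpsc;

// Drain every queued interrupt into the history, preserving send order.
fn drain_interrupts(rx: &mpsc::Receiver<String>, history: &mut Vec<String>) {
    while let Ok(msg) = rx.try_recv() {
        history.push(format!("user: {msg}"));
    }
}

fn main() {
    let (tx, rx) = mpsc::channel::<String>();
    let mut history = vec!["assistant: working...".to_string()];

    // Two interrupts arrive while a tool is running.
    tx.send("Actually, cancel that.".into()).unwrap();
    tx.send("Do X instead.".into()).unwrap();

    drain_interrupts(&rx, &mut history);
    assert_eq!(history.len(), 3); // both interrupts appended, in order
}
```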
### Example: cancel a running tool and pivot

```rust
// Start the agent and get an interrupt sender.
let (agent, tx) = DeepseekAgent::new(token)
    .with_streaming()
    .add_tool(SlowTool)
    .with_interrupt_channel();

// In another task (e.g. a user action), send an interrupt to change the plan.
tx.send("Actually, cancel that and do X instead.".into()).unwrap();

// If the agent is currently executing a tool, that tool will be aborted and the
// interrupt will be pushed into history so the next API turn sees it.
let mut stream = agent.chat("Do the slow thing.");
```
### Notes

- `InterruptSender` is non-blocking and can be cloned; use it from any async context without awaiting.
- Aborting a tool is implemented by cancelling the tool future (via the runtime). This is effective for most async tools, but if a tool holds on to external, non-cancellable resources you may want to implement cooperative cancellation inside the tool (for example, by checking a cancellation token).
- The agent preserves interrupt message ordering by draining any remaining queued interrupt messages when an interrupt is observed.
## MCP tools
MCP (Model Context Protocol) lets you use external processes as tools — Node scripts, Python services, anything that speaks MCP over stdio:
```rust
// Requires the `mcp` feature
let agent = DeepseekAgent::new(token)
    .add_tool(McpTool::stdio("npx", &["-y", "@playwright/mcp"]).await?);
```
## Exposing tools as an MCP server
The `mcp-server` feature lets you turn any `ToolBundle` into a standalone MCP server so other LLM clients (Claude Desktop, MCP Studio, etc.) can call your Rust tools.
```toml
[dependencies]
ds-api = { version = "0.10", features = ["mcp-server"] }
tokio = { version = "1", features = ["full"] }
```
### Stdio mode (Claude Desktop / MCP Studio)
Add this binary to your project and point Claude Desktop at it:
```rust
use ds_api::{McpServer, ToolBundle, tool};

struct Calculator;

#[tool]
impl ds_api::Tool for Calculator {
    /// Add two numbers.
    /// a: first operand
    /// b: second operand
    async fn add(&self, a: f64, b: f64) -> f64 { a + b }

    /// Multiply two numbers.
    /// a: first operand
    /// b: second operand
    async fn multiply(&self, a: f64, b: f64) -> f64 { a * b }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    McpServer::new(ToolBundle::new().add(Calculator))
        .with_name("my-calc-server")
        .serve_stdio()
        .await?;
    Ok(())
}
```
Register it in `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "my-calc": {
      "command": "/path/to/your/binary"
    }
  }
}
```
### HTTP mode (Streamable HTTP transport)
```rust
use ds_api::{McpServer, ToolBundle, tool};

struct Search;

#[tool]
impl ds_api::Tool for Search {
    /// Search the web.
    /// query: what to search for
    async fn search(&self, query: String) -> String {
        format!("results for: {query}")
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // MCP endpoint available at POST /mcp
    McpServer::new(ToolBundle::new().add(Search))
        .serve_http("0.0.0.0:3000")
        .await?;
    Ok(())
}
```
### Custom routing
For custom Axum routing, use `into_http_service()` to get a Tower-compatible service:
```rust
use ds_api::{McpServer, ToolBundle};
use rmcp::transport::streamable_http_server::tower::StreamableHttpServerConfig;

let service = McpServer::new(ToolBundle::new().add(MyTools))
    .into_http_service(StreamableHttpServerConfig::default());

let router = axum::Router::new()
    .nest_service("/mcp", service)
    .route("/health", axum::routing::get(|| async { "ok" }));

let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
axum::serve(listener, router).await?;
```
## Tool Bundle
`ToolBundle` can hold multiple `Tool` implementations and builds a name → index map for dispatch.
### Example

```rust
let group = ToolBundle::new()
    .add(FileSpells)
    .add(SearchSpells)
    .add(ShellSpells);

let agent = DeepseekAgent::custom(...)
    .add_tool(group)
    .add_tool(UiSpells { ... })
    .add_tool(SpawnSpell { ... });
```
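The name → index dispatch can be pictured with the simplified stand-in below. The real bundle stores trait objects and dispatches async tools; this sketch uses plain function pointers to keep it self-contained:

```rust
use std::collections::HashMap;

// Simplified stand-in for a registered tool: a name plus a callable.
struct ToolEntry {
    name: &'static str,
    call: fn(&str) -> String,
}

struct Bundle {
    tools: Vec<ToolEntry>,
    index: HashMap<&'static str, usize>, // name -> position in `tools`
}

impl Bundle {
    fn new() -> Self {
        Bundle { tools: Vec::new(), index: HashMap::new() }
    }

    // Register a tool and record its position under its name.
    fn add(mut self, entry: ToolEntry) -> Self {
        self.index.insert(entry.name, self.tools.len());
        self.tools.push(entry);
        self
    }

    // Look the tool up by name and invoke it.
    fn dispatch(&self, name: &str, args: &str) -> Option<String> {
        let &i = self.index.get(name)?;
        Some((self.tools[i].call)(args))
    }
}

fn main() {
    let bundle = Bundle::new()
        .add(ToolEntry { name: "add", call: |a| format!("add({a})") })
        .add(ToolEntry { name: "search", call: |a| format!("search({a})") });
    assert_eq!(bundle.dispatch("search", "rust"), Some("search(rust)".into()));
}
```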
## Familiar
Familiar is a high-level agent built on top of ds-api. It provides opinionated defaults and a batteries-included experience for common agent patterns. Check out familiar.
## Contributing
PRs welcome. Keep changes focused; update public API docs when behaviour changes.
## License
MIT OR Apache-2.0