Engram

Rust-native AI agent memory system achieving 95.8% accuracy on LongMemEval-S — #1 globally, surpassing all known commercial systems.

Architecture

Conversation ──► Extraction ──► Qdrant ──► Retrieval ──► Answer
                 (gpt-4o-mini)  (vector)   (hybrid+RRF)  (Gemini + GPT-5.2 ensemble)

Extraction breaks conversations into typed facts across 4 epistemic collections:

  • world — objective facts about the world
  • experience — personal experiences and events
  • opinion — preferences, beliefs, evaluations
  • observation — behavioral patterns and inferred traits
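The fact-type-to-collection mapping above can be pictured as a small Rust enum. This is an illustrative sketch only: the collection names come from this README, but the enum and method are not Engram's actual types.

```rust
/// The four epistemic fact types and their Qdrant collection names.
/// Illustrative sketch; not Engram's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EpistemicType {
    World,       // objective facts about the world
    Experience,  // personal experiences and events
    Opinion,     // preferences, beliefs, evaluations
    Observation, // behavioral patterns and inferred traits
}

impl EpistemicType {
    /// Name of the Qdrant collection that stores this fact type.
    fn collection(self) -> &'static str {
        match self {
            Self::World => "world",
            Self::Experience => "experience",
            Self::Opinion => "opinion",
            Self::Observation => "observation",
        }
    }
}

fn main() {
    for t in [
        EpistemicType::World,
        EpistemicType::Experience,
        EpistemicType::Opinion,
        EpistemicType::Observation,
    ] {
        println!("{:?} -> {}", t, t.collection());
    }
}
```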

Retrieval combines semantic search, keyword matching, temporal filtering, and entity-linked lookup, fused via Reciprocal Rank Fusion (RRF).
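RRF scores each candidate as the sum, over all result lists it appears in, of 1/(k + rank). A minimal sketch of the fusion step (function and variable names are hypothetical, not Engram's internals):

```rust
use std::collections::HashMap;

/// Fuse several ranked result lists with Reciprocal Rank Fusion.
/// Each list holds document IDs ordered best-first; `k` damps the
/// contribution of lower ranks (k = 60 is a common default).
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (rank, id) in list.iter().enumerate() {
            // rank is 0-based here, so the top hit contributes 1/(k + 1)
            *scores.entry((*id).to_string()).or_insert(0.0) +=
                1.0 / (k + rank as f64 + 1.0);
        }
    }
    // Sort descending by fused score.
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let semantic = vec!["m1", "m2", "m3"];
    let keyword = vec!["m2", "m4", "m1"];
    let fused = rrf_fuse(&[semantic, keyword], 60.0);
    // Memories ranked highly in both lists rise to the top.
    println!("{:?}", fused);
}
```

Because only ranks (not raw scores) enter the formula, RRF needs no score normalization across the semantic, keyword, temporal, and entity-linked retrievers.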

Features

  • 5-dimensional fact extraction (entities, epistemic type, temporality, source, confidence)
  • 4 epistemic Qdrant collections with type-specific schemas
  • Hybrid retrieval: semantic + keyword + temporal + entity-linked
  • RRF fusion with configurable k parameter
  • Agentic answering loop with tool use
  • MCP server for integration with Claude Desktop, Cursor, and other MCP clients
  • Full LongMemEval-S benchmark harness with 3-tier testing

MCP Quickstart

Engram exposes an MCP server (engram-server --mode mcp) that any MCP-compatible client can use to store and query long-term memory.

1. Start Qdrant

docker compose up -d qdrant --wait

2. Build engram-server

cargo build --release -p engram-server

3. Configure your MCP client

Claude Desktop — copy config/claude_desktop_config.example.json into ~/Library/Application Support/Claude/claude_desktop_config.json (macOS path) and update the binary path and API key.

Cursor — copy config/cursor_mcp.example.json into your Cursor MCP settings and update the binary path and API key.

Docker — start Qdrant via Docker Compose, then point your MCP client at the local binary:

docker compose up -d qdrant --wait   # start Qdrant only
# Then configure Claude Desktop / Cursor to run engram-server directly
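For reference, a Claude Desktop entry follows the standard mcpServers shape. The sketch below uses a placeholder binary path; the env variable names come from the Configuration section, and the --mode mcp flag from above. Check the shipped config/claude_desktop_config.example.json for the authoritative version.

```json
{
  "mcpServers": {
    "engram": {
      "command": "/absolute/path/to/engram/target/release/engram-server",
      "args": ["--mode", "mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ENGRAM_QDRANT_URL": "http://localhost:6334"
      }
    }
  }
}
```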

Available Tools

Tool            Description
memory_add      Extract and store facts from a conversation
memory_search   Semantic search across stored memories
memory_get      Retrieve a specific memory by ID
memory_delete   Soft-delete a memory
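Over stdio, an MCP client invokes these through the standard tools/call request. A sketch of such a request (the argument schema shown is assumed for illustration; query the server's tool listing for the real one):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_search",
    "arguments": { "query": "where did the user travel last summer?" }
  }
}
```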

Quickstart (Development)

1. Build

git clone https://github.com/nexusentis/engram.git
cd engram
cargo build --release

2. Set up Qdrant

Option A: Native binary (no Docker)

./scripts/setup-qdrant.sh
./bin/qdrant  # starts on :6333 (REST) + :6334 (gRPC)

Option B: Docker

docker compose up -d --wait

3. Configure

cp .env.example .env
# Edit .env and set OPENAI_API_KEY

4. Initialize

cargo run --bin engram -- init
cargo run --bin engram -- status

Crates

All crates are published to crates.io:

Crate            Description
engram-ai-core   Core library: types, storage, extraction, embedding, retrieval
engram-agent     Reusable LLM agent loop with tool-calling and lifecycle hooks
engram-ai        Convenience facade re-exporting engram-ai-core
engram-server    REST + MCP server binary
engram-cli       CLI binary (engram init, engram status)

Add the library to your project with:

cargo add engram-ai-core   # core library
cargo add engram-ai        # or the facade

Project Structure

engram/
├── crates/
│   ├── engram-ai-core/            # Core library
│   │   ├── src/
│   │   │   ├── api/               # HTTP + MCP API layers
│   │   │   ├── config/            # Configuration loading & validation
│   │   │   ├── embedding/         # Remote (OpenAI) embeddings
│   │   │   ├── extraction/        # LLM-based fact extraction pipeline
│   │   │   ├── retrieval/         # Hybrid search engine, RRF, reranking
│   │   │   ├── storage/           # Qdrant backend
│   │   │   ├── temporal/          # Temporal parsing & filtering
│   │   │   └── types/             # Core data types (Entity, Memory, Session)
│   │   └── tests/
│   ├── engram-agent/              # LLM agent loop
│   ├── engram-ai/                 # Facade crate
│   ├── engram-server/             # Server binary (MCP + HTTP modes)
│   └── engram-cli/                # CLI binary (init, status, config)
├── config/                        # TOML configs + MCP client examples
├── data/
│   └── longmemeval/               # Benchmark data & question sets
└── scripts/                       # Setup & utility scripts

Configuration

Configuration is via environment variables. See .env.example for common settings.

Key variables:

Variable            Required  Default                 Description
OPENAI_API_KEY      Yes       (none)                  OpenAI API key
ENGRAM_QDRANT_URL   No        http://localhost:6334  Qdrant gRPC endpoint

Benchmark

Engram includes a full LongMemEval-S benchmark harness with 3 testing tiers:

Tier       Questions  Cost (Gemini)  Use
Fast Loop  60         ~$4            Every tweak
Gate       231        ~$15           Before promoting a change
Truth      500        ~$30           Definitive score

Run the harness through the integration tests:
# Load env vars
set -a; source .env; set +a

# Fast loop (60 questions)
BENCHMARK_CONFIG=config/benchmark.toml INGESTION=skip \
  QUESTION_IDS=@data/longmemeval/fast_60.txt \
  cargo test --release --test integration_benchmark test_validation_benchmark -- --ignored --nocapture

# Full truth run (500 questions)
BENCHMARK_CONFIG=config/benchmark.toml INGESTION=skip FULL_BENCHMARK=1 \
  cargo test --release --test integration_benchmark test_validation_benchmark -- --ignored --nocapture

Control ingestion with INGESTION=full|skip|incremental. Use INGESTION=skip to iterate on query-time changes without re-ingesting.

Documentation

  • Research — The full research narrative: from 0% to 95.8% across eleven phases, failed experiments, engineering discipline rules, and the path forward.
  • Developer Docs — API reference, configuration, and integration guides.

License

MIT
