# Engram
Rust-native AI agent memory system achieving 95.8% accuracy on LongMemEval-S — #1 globally, surpassing all known commercial systems.
## Architecture

```text
Conversation ──► Extraction ──► Qdrant ──► Retrieval ──► Answer
                (gpt-4o-mini)   (vector)   (hybrid+RRF)  (Gemini + GPT-5.2 ensemble)
```
Extraction breaks conversations into typed facts across 4 epistemic collections:
- `world` — objective facts about the world
- `experience` — personal experiences and events
- `opinion` — preferences, beliefs, evaluations
- `observation` — behavioral patterns and inferred traits
Retrieval combines semantic search, keyword matching, temporal filtering, and entity-linked lookup, fused via Reciprocal Rank Fusion (RRF).
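RRF itself is simple: each ranked list contributes `1 / (k + rank)` to a document's fused score, so documents that appear near the top of several rankers win. A minimal Rust sketch of the idea (illustrative only, not Engram's actual implementation):

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
/// to a document's fused score (ranks are 1-based).
fn rrf_fuse(ranked_lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in ranked_lists {
        for (rank, doc) in list.iter().enumerate() {
            *scores.entry((*doc).to_string()).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    // Both rankers agree on m1; RRF rewards that consensus.
    let semantic = vec!["m1", "m2", "m3"];
    let keyword = vec!["m1", "m3", "m4"];
    let fused = rrf_fuse(&[semantic, keyword], 60.0);
    println!("{:?}", fused);
}
```

`k = 60` is the conventional default in the RRF literature; Engram exposes it as a configurable parameter.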
## Features
- 5-dimensional fact extraction (entities, epistemic type, temporality, source, confidence)
- 4 epistemic Qdrant collections with type-specific schemas
- Hybrid retrieval: semantic + keyword + temporal + entity-linked
- RRF fusion with configurable k parameter
- Agentic answering loop with tool use
- MCP server for integration with Claude Desktop, Cursor, and other MCP clients
- Full LongMemEval-S benchmark harness with 3-tier testing
## MCP Quickstart

Engram exposes an MCP server (`engram-server --mode mcp`) that any MCP-compatible client can use to store and query long-term memory.
### 1. Start Qdrant

```shell
docker compose up -d qdrant --wait
```
### 2. Build engram-server

```shell
cargo build --release -p engram-server
```
### 3. Configure your MCP client

**Claude Desktop** — copy `config/claude_desktop_config.example.json` to `~/Library/Application Support/Claude/claude_desktop_config.json`, then update the binary path and API key.

**Cursor** — copy `config/cursor_mcp.example.json` into your Cursor MCP settings, then update the binary path and API key.

**Docker** — start Qdrant via Docker Compose, then point your MCP client at the local binary:

```shell
docker compose up -d qdrant --wait   # start Qdrant only
# Then configure Claude Desktop / Cursor to run engram-server directly
```
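For reference, a typical Claude Desktop entry looks like the sketch below; the exact keys in the shipped example file may differ, and the binary path and API key are placeholders:

```json
{
  "mcpServers": {
    "engram": {
      "command": "/absolute/path/to/engram/target/release/engram-server",
      "args": ["--mode", "mcp"],
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}
```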
## Available Tools

| Tool | Description |
|---|---|
| `memory_add` | Extract and store facts from a conversation |
| `memory_search` | Semantic search across stored memories |
| `memory_get` | Retrieve a specific memory by ID |
| `memory_delete` | Soft-delete a memory |
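Over MCP, each tool is invoked with a standard `tools/call` JSON-RPC request; the argument names below are illustrative, not Engram's exact schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_search",
    "arguments": { "query": "what coffee does the user drink?" }
  }
}
```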
## Quickstart (Development)
### 1. Build

```shell
git clone https://github.com/nexusentis/engram.git
cd engram
cargo build --release
```
### 2. Set up Qdrant

**Option A: Native binary (no Docker)**

```shell
./scripts/setup-qdrant.sh
./bin/qdrant   # starts on :6333 (REST) + :6334 (gRPC)
```

**Option B: Docker**

```shell
docker compose up -d --wait
```
### 3. Configure

```shell
cp .env.example .env
# Edit .env and set OPENAI_API_KEY
```
### 4. Initialize

```shell
cargo run --bin engram -- init
cargo run --bin engram -- status
```
## Crates

All crates are published to crates.io:

| Crate | Description |
|---|---|
| `engram-ai-core` | Core library: types, storage, extraction, embedding, retrieval |
| `engram-agent` | Reusable LLM agent loop with tool-calling and lifecycle hooks |
| `engram-ai` | Convenience facade re-exporting `engram-ai-core` |
| `engram-server` | REST + MCP server binary |
| `engram-cli` | CLI binary (`engram init`, `engram status`) |

```shell
cargo add engram-ai-core   # core library
cargo add engram-ai        # or the facade
```
## Project Structure

```text
engram/
├── crates/
│   ├── engram-ai-core/       # Core library
│   │   ├── src/
│   │   │   ├── api/          # HTTP + MCP API layers
│   │   │   ├── config/       # Configuration loading & validation
│   │   │   ├── embedding/    # Remote (OpenAI) embeddings
│   │   │   ├── extraction/   # LLM-based fact extraction pipeline
│   │   │   ├── retrieval/    # Hybrid search engine, RRF, reranking
│   │   │   ├── storage/      # Qdrant backend
│   │   │   ├── temporal/     # Temporal parsing & filtering
│   │   │   └── types/        # Core data types (Entity, Memory, Session)
│   │   └── tests/
│   ├── engram-agent/         # LLM agent loop
│   ├── engram-ai/            # Facade crate
│   ├── engram-server/        # Server binary (MCP + HTTP modes)
│   └── engram-cli/           # CLI binary (init, status, config)
├── config/                   # TOML configs + MCP client examples
├── data/
│   └── longmemeval/          # Benchmark data & question sets
└── scripts/                  # Setup & utility scripts
```
## Configuration

Configuration is via environment variables. See `.env.example` for common settings.

Key variables:

| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENAI_API_KEY` | Yes | — | OpenAI API key |
| `ENGRAM_QDRANT_URL` | No | `http://localhost:6334` | Qdrant gRPC endpoint |
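The optional variable falls back to its documented default when unset; a sketch of that lookup (the helper name is illustrative, not Engram's actual config code):

```rust
use std::env;

// Illustrative helper (not Engram's actual config code): read the optional
// ENGRAM_QDRANT_URL variable, falling back to the documented default.
fn qdrant_url() -> String {
    env::var("ENGRAM_QDRANT_URL")
        .unwrap_or_else(|_| "http://localhost:6334".to_string())
}

fn main() {
    println!("Qdrant endpoint: {}", qdrant_url());
}
```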
## Benchmark
Engram includes a full LongMemEval-S benchmark harness with 3 testing tiers:
| Tier | Questions | Cost (Gemini) | Use |
|---|---|---|---|
| Fast Loop | 60 | ~$4 | Every tweak |
| Gate | 231 | ~$15 | Before promoting a change |
| Truth | 500 | ~$30 | Definitive score |
```shell
# Load env vars
set -a; source .env; set +a

# Fast loop (60 questions)
BENCHMARK_CONFIG=config/benchmark.toml INGESTION=skip \
  QUESTION_IDS=@data/longmemeval/fast_60.txt \
  cargo test --release --test integration_benchmark test_validation_benchmark -- --ignored --nocapture

# Full truth run (500 questions)
BENCHMARK_CONFIG=config/benchmark.toml INGESTION=skip FULL_BENCHMARK=1 \
  cargo test --release --test integration_benchmark test_validation_benchmark -- --ignored --nocapture
```
Control ingestion with `INGESTION=full|skip|incremental`. Use `INGESTION=skip` to iterate on query-time changes without re-ingesting.
## Documentation
- Research — The full research narrative: from 0% to 95.8% across eleven phases, failed experiments, engineering discipline rules, and the path forward.
- Developer Docs — API reference, configuration, and integration guides.
## License
MIT