#sql #acid #embedded-database #mvcc

bin+lib stoolap

High-performance embedded SQL database with MVCC, time-travel queries, and full ACID compliance

8 releases

new 0.3.1 Feb 18, 2026
0.3.0 Feb 15, 2026
0.2.4 Jan 24, 2026
0.1.0 Dec 12, 2025

#886 in Database interfaces

Apache-2.0

6.5MB
125K SLoC

Stoolap Logo

A Modern Embedded SQL Database in Pure Rust

WebsiteDocumentationReleasesBenchmarks

CI codecov Crates.io GitHub release License


Why Stoolap?

Stoolap is a feature-rich embedded SQL database with capabilities that rival established databases like PostgreSQL and DuckDB - all in a single dependency with zero external requirements.

Performance

Stoolap is optimized for OLTP workloads: point queries, transactional updates, and real-time analytics. It uses parallel execution via Rayon for large scans and a cost-based optimizer for query planning.

See BENCHMARKS.md for detailed comparisons against SQLite and DuckDB.

Unique Features

Feature Stoolap SQLite DuckDB PostgreSQL
AS OF Time-Travel Queries ❌*
MVCC Transactions
Cost-Based Optimizer
Adaptive Query Execution
Semantic Query Caching
Parallel Query Execution
Pure Rust (Memory Safe)
No C/C++ Required

*PostgreSQL requires extensions for temporal queries


Quick Start

Installation

Add to your Cargo.toml:

[dependencies]
stoolap = "0.3"

Or build from source:

git clone https://github.com/stoolap/stoolap.git
cd stoolap
cargo build --release

Library Usage

use stoolap::api::Database;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // In-memory database
    let db = Database::open_in_memory()?;

    // Or persistent storage
    // let db = Database::open("file:///path/to/data")?;

    // Create table
    db.execute("CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT UNIQUE,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )", ())?;

    // Insert with parameters
    db.execute("INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        (1, "Alice", "alice@example.com"))?;

    // Query with iteration
    for row in db.query("SELECT * FROM users WHERE id = ?", (1,))? {
        let row = row?;
        println!("User: {} <{}>",
            row.get::<String>(1)?,  // name
            row.get::<String>(2)?   // email
        );
    }

    Ok(())
}

Command Line Interface

# Interactive REPL (in-memory)
./stoolap

# Persistent database
./stoolap --db "file:///var/lib/stoolap/data"

# Execute query directly
./stoolap -e "SELECT version()"

# Execute SQL file
./stoolap --db "file://./mydb" < schema.sql

Features

MVCC Transactions

Full multi-version concurrency control with isolation levels:

-- Read Committed (default)
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- Snapshot Isolation (repeatable reads)
BEGIN TRANSACTION ISOLATION LEVEL SNAPSHOT;
SELECT * FROM accounts;  -- Consistent view throughout transaction
COMMIT;

Time-Travel Queries

Query historical data at any point in time - a feature typically only found in enterprise databases:

-- Query data as it existed at a specific timestamp
SELECT * FROM orders AS OF TIMESTAMP '2024-01-15 10:30:00';

-- Query data as of a specific transaction
SELECT * FROM inventory AS OF TRANSACTION 1234;

-- Compare current vs historical data
SELECT
    c.price AS current_price,
    h.price AS old_price,
    c.price - h.price AS change
FROM products c
JOIN products AS OF TIMESTAMP '2024-01-01 00:00:00' h ON c.id = h.id
WHERE c.price != h.price;

Smart Indexes

Automatic index type selection based on data characteristics:

-- B-tree (auto-selected for INTEGER, FLOAT, TIMESTAMP)
-- Best for: range queries, sorting, prefix matching
CREATE INDEX idx_date ON orders(created_at);
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';

-- Hash (auto-selected for TEXT, JSON)
-- Best for: O(1) equality lookups
CREATE INDEX idx_email ON users(email);
SELECT * FROM users WHERE email = 'alice@example.com';

-- Bitmap (auto-selected for BOOLEAN)
-- Best for: low-cardinality columns, efficient AND/OR
CREATE INDEX idx_status ON orders(status) USING BITMAP;

-- Multi-column composite indexes
CREATE INDEX idx_lookup ON events(user_id, event_type);
CREATE UNIQUE INDEX idx_unique ON orders(customer_id, order_date);

Window Functions

Full analytical query support:

SELECT
    employee_name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
    LAG(salary) OVER (ORDER BY hire_date) AS prev_salary,
    AVG(salary) OVER (PARTITION BY department) AS dept_avg,
    SUM(salary) OVER (ORDER BY hire_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM employees;

Common Table Expressions (CTEs)

Including recursive queries for hierarchical data:

-- Recursive CTE: organizational hierarchy
WITH RECURSIVE org_chart AS (
    -- Base case: top-level managers
    SELECT id, name, manager_id, 1 AS level
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case: employees under managers
    SELECT e.id, e.name, e.manager_id, oc.level + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart ORDER BY level, name;

Advanced Aggregations

-- ROLLUP: hierarchical subtotals
SELECT region, product, SUM(sales)
FROM sales GROUP BY ROLLUP(region, product);

-- CUBE: all dimension combinations
SELECT region, product, SUM(sales)
FROM sales GROUP BY CUBE(region, product);

-- GROUPING SETS: custom combinations
SELECT region, product, SUM(sales)
FROM sales GROUP BY GROUPING SETS ((region, product), (region), ());

Subqueries

Scalar, correlated, EXISTS, IN, ANY/ALL:

-- Correlated subquery
SELECT * FROM employees e
WHERE salary > (
    SELECT AVG(salary) FROM employees
    WHERE department = e.department
);

-- EXISTS with correlation
SELECT * FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id AND o.amount > 1000
);

Query Optimizer

PostgreSQL-style cost-based optimizer with runtime adaptation:

-- Collect statistics for better query plans
ANALYZE orders;
ANALYZE customers;

-- View query plan with cost estimates
EXPLAIN SELECT * FROM orders WHERE customer_id = 100;

-- View plan with actual execution statistics
EXPLAIN ANALYZE
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.country = 'US';

Data Types

Type Description Example
INTEGER 64-bit signed integer 42, -100
FLOAT 64-bit floating point 3.14, -0.001
TEXT UTF-8 string 'hello', '日本語'
BOOLEAN true/false TRUE, FALSE
TIMESTAMP Date and time '2024-01-15 10:30:00'
JSON JSON data '{"key": "value"}'

Built-in Functions (110)

String Functions

UPPER, LOWER, LENGTH, CHAR_LENGTH, CHAR, TRIM, LTRIM, RTRIM, CONCAT, CONCAT_WS, SUBSTRING, SUBSTR, REPLACE, REVERSE, LEFT, RIGHT, LPAD, RPAD, REPEAT, POSITION, STRPOS, LOCATE, INSTR, SPLIT_PART

Math Functions

ABS, CEIL, CEILING, FLOOR, ROUND, TRUNCATE, TRUNC, SQRT, POWER, POW, MOD, SIGN, GREATEST, LEAST, EXP, LN, LOG, LOG10, LOG2, SIN, COS, TAN, PI, RANDOM

Date/Time Functions

NOW, CURRENT_DATE, CURRENT_TIMESTAMP, EXTRACT, DATE_TRUNC, TIME_TRUNC, DATE_ADD, DATE_SUB, DATEDIFF, DATE_DIFF, YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, TO_CHAR, VERSION

JSON Functions

JSON_EXTRACT, JSON_TYPE, JSON_TYPEOF, JSON_VALID, JSON_KEYS, JSON_ARRAY, JSON_ARRAY_LENGTH, JSON_OBJECT

Aggregate Functions

COUNT, SUM, AVG, MIN, MAX, MEDIAN, STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, VAR_SAMP, STRING_AGG, GROUP_CONCAT, ARRAY_AGG, FIRST, LAST

Window Functions

ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE, PERCENT_RANK, CUME_DIST

Utility Functions

COALESCE, NULLIF, IFNULL, CAST, IIF, TYPEOF, COLLATE, SLEEP


Storage & Persistence

# In-memory (fastest, data lost on exit)
./stoolap --db "memory://"

# File-based (durable storage with WAL)
./stoolap --db "file:///var/lib/stoolap/data"

Durability features:

  • Write-Ahead Logging (WAL): All changes logged before applied
  • Periodic Snapshots: Fast recovery from crashes
  • Index Persistence: All indexes saved and restored automatically

Architecture

src/
├── api/           # Public API (Database, Connection, Rows)
├── core/          # Core types (Value, Row, Schema, Error)
├── parser/        # SQL lexer and parser
├── optimizer/     # Cost-based query optimizer
│   ├── cost.rs        # Cost model with I/O and CPU costs
│   ├── join.rs        # Join optimization (dynamic programming)
│   ├── bloom.rs       # Bloom filter propagation
│   └── aqe.rs         # Adaptive query execution
├── executor/      # Query execution engine
│   ├── operators/     # Volcano-style operators
│   ├── parallel.rs    # Parallel execution (Rayon)
│   └── expression/    # Expression VM
├── functions/     # 110 built-in functions
│   ├── scalar/        # String, math, date, JSON
│   ├── aggregate/     # COUNT, SUM, AVG, etc.
│   └── window/        # ROW_NUMBER, RANK, LAG, etc.
└── storage/       # Storage engine
    ├── mvcc/          # Multi-version concurrency control
    └── index/         # B-tree, Hash, Bitmap indexes

Development

Building

cargo build              # Debug build
cargo build --release    # Optimized release build

Testing

cargo nextest run        # Run all tests (recommended)
cargo test               # Standard test runner

Code Quality

cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --check

Documentation

cargo doc --open         # Generate and open API docs

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.


License

Apache License 2.0. See LICENSE for details.

Dependencies

~13–39MB
~474K SLoC