Ultra-fast heap dump analysis, leak detection, code mapping, and AI-generated fixes — powered by Rust, LLMs, and the Model Context Protocol (MCP).
- Overview
- Key Features
- Architecture
- Installation
- Maintainer
- Usage
- MCP Integration
- Project Structure
- Performance
- Roadmap
- Contributing
- License
- Acknowledgements
Mnemosyne is a next-generation AI-assisted JVM memory analysis tool. It brings total clarity to complex Java/Kotlin heap dumps by combining:
- ⚡ High-performance Rust-based heap parsing
- 🧩 Object-graph parsing plus dominator-backed retained sizes, grouped histograms, unreachable-object analysis, and class-level diffing in the graph-backed analysis path
- 🧠 AI-generated explanations and heuristic fix guidance
- 🛠 Seamless IDE integration via the Model Context Protocol (MCP)
- 🧬 Code mapping, leak reproduction, forecasting, and more
Mnemosyne transforms .hprof heap dumps, GC logs, and thread dumps into actionable insights — giving you root cause analysis, memory leak detection, and guided solutions.
- Blazing-fast Rust-based
.hprofparser - Streaming I/O with low memory overhead
- Suitable for multi-gigabyte heap dumps
mnemosyne analyzeandmnemosyne leaksboth use graph-backed retained sizes when the object graph is available, then fall back to heuristics with provenance markersmnemosyne analyze --group-by class|package|classloadernow renders graph-backed histogram tables with instance, shallow-size, and retained-size totals, plus an unreachable-object summary when full parsing succeeds- Optional investigation reports now hang off the same graph-backed path:
mnemosyne analyze --threads --strings --collections --top-instancesadds per-thread retained-size views, duplicate-string analysis, collection waste inspection, and top-instance ranking in one run --top-nand--min-capacitylet you tune report depth and collection noise floor without changing the underlying analysis pipeline- Parse summaries and leak listings now render aligned terminal tables at the CLI boundary, with follow-up disclosure sections when width-bounded cells truncate long values
- Parse summaries describe heap record categories by aggregate bytes/share/entries so the lightweight view does not imply class-level retained-size semantics
- Authentic GC path finder now tries full
ObjectGraphBFS first, then budget-limited parsing, then synthetic fallback when needed - Shared object-graph model now lives under
core::hprof/, withcore::hprof::binary_parserandcore::graph::dominatorproviding an established graph-backed retained-size pipeline plus navigation APIs (get_object,get_references,get_referrers), typed field readers, and retained stack-trace metadata - Contextual CLI error messages now flag common wrong inputs, suggest nearby
.hproffiles when a path is missing, and surface config-fix hints for invalid TOML or bad config overrides
- Natural-language explanations for memory leaks
- Automatic detection of:
- Coroutine leaks
- Thread leaks
- HTTP client response leaks
- ClassLoader leaks
- Cache & collection leaks
mnemosyne analyzeandmnemosyne leaksboth attempt object-graph → dominator → retained-size analysis first, then fall back to heuristics with provenance markers when graph parsing is unavailable- Graph-backed suspect ranking now scores leaks using retained/shallow ratio, accumulation-point detection, dominated-object counts, and short reference-chain context while preserving backward compatibility via optional response fields
- AI-generated code fixes
- Leak reproduction snippet generator
- Maps leaked objects → source code lines
- Git-aware:
- blame
- commit introducer detection
- Works with Java & Kotlin projects
Fully integrated with:
- VS Code
- Cursor
- Zed
- JetBrains (via MCP plugin)
- ChatGPT Desktop
Available MCP commands:
- parse_heap
- detect_leaks
- map_to_code
- find_gc_path
- explain_leak
- propose_fix
- apply_fix
Mnemosyne becomes a Memory Debugging Copilot inside your editor.
Mnemosyne v0.2.0 (alpha) is now available. Tagged GitHub releases now publish prebuilt
mnemosyne-cliarchives for x86_64 Linux, aarch64 Linux, x86_64 macOS, aarch64 macOS, and x86_64 Windows.mnemosyne-coreandmnemosyne-cliare published on crates.io, the repository includes a Homebrew formula for macOS release archives, and tagged releases publish a Docker image toghcr.io/bballer03/mnemosyne.
The repository now includes a GitHub Actions CI workflow that runs workspace check, test, clippy, and fmt on pushes and pull requests, plus a release workflow that validates version tags, builds release archives for five targets, and publishes them on tagged releases.
Visit the repository's Releases page and download the archive for your platform from any v* tag release.
Both mnemosyne-core and mnemosyne-cli are published on crates.io.
Install the CLI with:
cargo install mnemosyne-cliThe repository now includes HomebrewFormula/mnemosyne.rb for tagged GitHub Release archives:
brew install ./HomebrewFormula/mnemosyne.rbThe formula selects Apple Silicon vs Intel archives automatically with Hardware::CPU.arm?.
Tagged releases now publish a container image to GHCR with version, major.minor, and latest tags:
# Use a specific version tag instead of :latest for reproducibility
docker pull ghcr.io/bballer03/mnemosyne:0.2.0
# Parse a heap dump
docker run --rm -v /path/to/dumps:/data:ro ghcr.io/bballer03/mnemosyne:0.2.0 parse /data/heap.hprof
# Analyze a heap dump
docker run --rm -v /path/to/dumps:/data:ro ghcr.io/bballer03/mnemosyne:0.2.0 analyze /data/heap.hprof
# Detect leaks
docker run --rm -v /path/to/dumps:/data:ro ghcr.io/bballer03/mnemosyne:0.2.0 leaks /data/heap.hprofThe image runs as a non-root user, uses /data as its working directory, and sets mnemosyne-cli as the entrypoint so heap dumps can be mounted directly into the container.
git clone https://github.com/bballer03/mnemosyne
cd mnemosynecargo build --releaseexport OPENAI_API_KEY="your-api-key-here"
# or use a .env file
echo "OPENAI_API_KEY=your-api-key-here" > .envMnemosyne automatically looks for additional settings in the following order:
--config /path/to/file.toml(explicit CLI flag)$MNEMOSYNE_CONFIGenvironment variable.mnemosyne.tomlin the current working directory~/.config/mnemosyne/config.toml/etc/mnemosyne/config.toml
Run mnemosyne config to inspect the effective configuration and where it was loaded from.
Need consistent leak filtering defaults for every command? Add an [analysis] block to your config so mnemosyne leaks, analyze, and explain all share the same thresholds:
[analysis]
min_severity = "MEDIUM"
packages = ["com.example", "org.demo"]
leak_types = ["CACHE", "THREAD", "HTTP_RESPONSE"]
accumulation_threshold = 10.0CLI flags such as --min-severity or --package still win, but the config keeps the day-one experience aligned across local runs, CI, and MCP.
Leaks below min_severity are filtered out, so noisy low-priority signals stay out of your reports.
accumulation_threshold controls when a graph-backed suspect is flagged as an accumulation point. Lower values make Mnemosyne more aggressive about highlighting small objects that retain disproportionately large subgraphs.
When you specify multiple packages, Mnemosyne first treats them as an allow-list for real classes (only matching histograms become leak candidates) and then rotates through them as it synthesizes fallback identifiers so each category stays easy to trace back to its service/module.
Prefer shell overrides? Export MNEMOSYNE_MIN_SEVERITY, MNEMOSYNE_PACKAGES, and MNEMOSYNE_LEAK_TYPES before running the CLI to apply the same defaults without a file.
The packaged binary name is mnemosyne-cli:
./target/release/mnemosyne-cli parse heap.hprofMnemosyne is maintained by bballer03. GitHub repository: https://github.com/bballer03/mnemosyne
mnemosyne parse heap.hprofExample output:
✓ Parsed heap dump.
Heap path: heap.hprof
File size: 2.40 GB
Format: JAVA PROFILE 1.0.2 | Identifier bytes: 8 | Timestamp(ms): 1709836800000
Estimated objects: 1234567
Total HPROF records: 5678901
Top heap record categories by aggregate bytes:
# Record Category Bytes Share Entries
1 INSTANCE_DUMP 421.00 MB 50.1% 345678
2 PRIMITIVE_ARRAY_DUMP 312.00 MB 37.1% 234567
3 OBJECT_ARRAY_DUMP 89.00 MB 10.6% 67890
4 CLASS_DUMP 12.00 MB 1.4% 4321
5 HEAP_DUMP_SEGMENT 7.50 MB 0.9% 12
Top record tags:
Record Tag Hex Entries Size
HEAP_DUMP_SEGMENT 0x1C 12 841.50 MB
STRING_IN_UTF8 0x01 89012 15.00 MB
LOAD_CLASS 0x02 4321 0.50 MB
STACK_TRACE 0x05 1234 0.20 MB
STACK_FRAME 0x04 5678 0.10 MB
Those numbers come straight from Mnemosyne's lightweight record-stat histogram derived from the HPROF stream. The parse view is intentionally record-category oriented, while richer class- and object-level retained-size semantics still live in the graph-backed analysis path. If a record-category label is truncated to fit the terminal table, Mnemosyne prints a follow-up disclosure section with the full value.
mnemosyne leaks heap.hprofExample output:
✓ Leak detection complete.
Potential leaks:
Leak ID Class Kind Severity Retained Instances
leak-usersession-1 com.example.UserSessionCache Cache High 512.00 MB 125432
leak-okhttp-1 okhttp3.Response Resource Medium 89.00 MB 8921
Leak: leak-usersession-1
Description: Cache growing unbounded, cleanup thread blocked
Provenance:
[SYNTHETIC] generated from histogram heuristics
Leak: leak-okhttp-1
Description: Unclosed HTTP response bodies
Provenance:
[SYNTHETIC] generated from histogram heuristics
Long leak IDs and class names stay recoverable even when the terminal table bounds cell width; Mnemosyne emits disclosure sections immediately after the table whenever truncation occurs.
Tune the heuristics per run with --leak-kind. Repeat the flag (or provide a comma list) to emit one synthetic entry per requested category:
mnemosyne leaks heap.hprof --leak-kind cache --leak-kind threadNeed to scope results to multiple namespaces? --package accepts comma-delimited or repeated values:
mnemosyne leaks heap.hprof --package com.example --package org.demoUnder the hood Mnemosyne now filters real class stats with those package prefixes before it ever synthesizes candidates, which keeps the CLI/MCP output focused on the code you actually own.
Both mnemosyne leaks and mnemosyne analyze now attempt graph-backed analysis first, then fall back to heuristics with explicit provenance when the heap dump lacks enough object-graph detail. mnemosyne analyze additionally surfaces dominator metrics and richer graph detail in its report output.
When graph-backed analysis succeeds, mnemosyne analyze can also print a grouped histogram (--group-by class|package|classloader), an unreachable-object summary, optional thread/string/collection/top-instance reports, and mnemosyne diff augments the existing record-level comparison with class-level retained-size deltas.
If no candidates survive filtering, mnemosyne leaks now prints No leak suspects detected. so zero-result runs are explicit instead of silent.
mnemosyne map leak-com.example.MemoryKeeper::d34db33f --project-root ./your-service --class com.example.MemoryKeeperExample output:
Source candidates for `com.example.MemoryKeeper::d34db33f`:
- ./your-service/src/main/java/com/example/MemoryKeeper.java:3 (public class MemoryKeeper)
public class MemoryKeeper {
void retain() {}
}
mnemosyne explain heap.hprof --leak-id com.example.UserSessionCache::deadbeefExample output:
Model: gpt-4.1-mini (confidence 78%)
com.example.UserSessionCache is retaining ~512.00 MB via 125432 instances; prioritize freeing it to reclaim 21.0% of the heap.
Recommendations:
- Guard com.example.UserSessionCache lifetimes: ensure cleanup hooks dispose unused entries.
- Add targeted instrumentation (counters, timers) around the suspected allocation sites.
- Review threading / coroutine lifecycles anchoring these objects to a GC root.
Behind the scenes Mnemosyne packages every AI prompt/response in TOON for deterministic machine parsing. The CLI still prints conversational text, but automation can read the structured transcript via
analysis.ai.wire(or the MCPexplain_leakresponse) to forward the TOON payload to a real LLM.
mnemosyne fix heap.hprof --leak-id com.example.UserSessionCache::deadbeef --style defensive --project-root ./your-serviceExample output:
Fix for com.example.UserSessionCache [com.example.UserSessionCache::deadbeef] (Defensive, confidence 72%):
File: ./your-service/src/main/java/com/example/UserSessionCache.java
Wrap com.example.UserSessionCache allocations in try-with-resources / finally blocks to avoid lingering references.
Patch:
--- a/./your-service/src/main/java/com/example/UserSessionCache.java
+++ b/./your-service/src/main/java/com/example/UserSessionCache.java
@@ public void retain(...)
-Resource r = allocator.acquire();
+try (Resource r = allocator.acquire()) {
+ // existing logic
+}
mnemosyne gc-path heap.hprof --object-id 0x7f8a9c123456 --max-depth 5Example output:
GC path for 0x0000000033333333:
#0 ROOT -> com.example.Leaky [0x0000000044444444] via ROOT Unknown
#1 -> java.lang.Object [0x0000000033333333] via leakyField
Mnemosyne now resolves GC paths by trying full ObjectGraph BFS first via trace_on_object_graph(), then a budget-limited GcGraph, and only then a synthetic path if the heap dump omits the needed records or exceeds the configured sampling budget.
mnemosyne analyze heap.hprof --ai --threads --strings --collections --top-instances --top-n 10 --min-capacity 32Example output:
🧠 AI Analysis:
Root Cause: UserSessionCache is retaining stale sessions because the
cleanup thread is deadlocked waiting on a monitor lock held by the
main request handler thread.
Recommendation:
1. Add timeout to cache.cleanup() method
2. Use ConcurrentHashMap instead of synchronized HashMap
3. Consider using weak references for session storage
Code Fix Available: Run 'mnemosyne fix heap.hprof' to generate patch
When --ai is enabled, the CLI and reports include an AI Insights block that summarizes the suspected root cause, model confidence, and recommended remediation steps. This currently uses deterministic heuristics so the UX stays consistent offline.
Need deeper investigation without switching tools? The same analyze run can now append thread-retention tables, duplicate-string groups, oversized-collection summaries, and the largest retained instances via --threads, --strings, --collections, and --top-instances.
mnemosyne analyze heap.hprof --format toon > report.toonPrefer a machine-readable JSON artifact? Swap in --format json --output-file report.json. The CLI writes every report to stdout by default, but --output-file lets you persist HTML/Markdown/TOON/JSON without juggling shell redirection.
Example TOON payload:
TOON v1
section summary
heap=heap.hprof
objects=1234567
bytes=2453291008
size_gb=2.29
graph_nodes=321
leak_count=1
section leaks
leak#0
id=com.example.UserSessionCache::deadbeef
class=com.example.UserSessionCache
kind=Cache
severity=High
retained_mb=512.00
instances=125432
description=UserSessionCache dominates 21% of the heap via stale sessions
section dominators
dominator#0
name=com.example.UserSessionCache
parent=<heap-root>
descendants=642
retained_mb=512.00
section ai
model=gpt-4.1-mini
confidence_pct=78
summary=UserSessionCache retains ~512 MB because cleanup threads stalled; freeing it would reclaim 21% of the heap.
rec#0
text=Guard UserSessionCache lifetimes with auto-expire entries
rec#1
text=Instrument cleanup thread health
# Quick analysis
mnemosyne analyze heap.hprof
# Verbose output with debug info
mnemosyne analyze heap.hprof -v
# Filter by package
mnemosyne leaks heap.hprof --package com.example
# Specify multiple packages (comma or repeated flag)
mnemosyne analyze heap.hprof --package com.example,org.demo
# Focus on selected leak kinds
mnemosyne analyze heap.hprof --leak-kind cache,thread
# Group the analysis histogram by package
mnemosyne analyze heap.hprof --group-by package
# Add investigation reports to the same analysis run
mnemosyne analyze heap.hprof --threads --strings --collections --top-instances --top-n 15 --min-capacity 32
# Export HTML report
mnemosyne analyze heap.hprof --format html --output-file report.html
# Emit JSON for CI
mnemosyne analyze heap.hprof --format json --output-file report.json
# Compare two heap dumps
mnemosyne diff before.hprof after.hprof
# Map leak to code
mnemosyne map leak-foo --project-root ./service --class com.example.MemoryKeeper
# Explain a specific leak
mnemosyne explain heap.hprof --leak-id com.example.UserSessionCache::deadbeef
# Generate a defensive fix patch
mnemosyne fix heap.hprof --leak-id com.example.UserSessionCache::deadbeef --style defensive
# Trace GC path
mnemosyne gc-path heap.hprof --object-id 0x7f8a9c123456 --max-depth 4
# Inspect effective config (and source)
mnemosyne config --config ./ops/prod.toml$ mnemosyne diff before.hprof after.hprof
Heap diff: before.hprof -> after.hprof
Delta size: +128.00 MB
Delta objects: +12500
Top changes:
- INSTANCE_DUMP: +96.00 MB (before 384.00 MB -> after 480.00 MB)
- OBJECT_ARRAY_DUMP: +32.00 MB (before 256.00 MB -> after 288.00 MB)
Class-level retained deltas:
Class Instances Shallow Retained Delta
com.example.UserSessionCache +820 45.00 -> 97.00 MB +213.00 MB
java.lang.String +410 398.00 -> 421.00 MB +44.00 MBThe diff command still preserves the fast record/class summary comparison, and when both snapshots build object graphs it now also reports class-level instance, shallow-byte, and retained-byte deltas. Use it inside CI (see docs/examples) to fail builds when heap growth crosses your budget.
Mnemosyne integrates seamlessly with MCP-compatible AI clients.
Edit or create .vscode/mcp-config.json:
{
"mcpServers": {
"mnemosyne": {
"command": "/path/to/mnemosyne",
"args": ["serve"],
"env": {
"OPENAI_API_KEY": "${env:OPENAI_API_KEY}"
}
}
}
}Edit ~/.config/zed/settings.json:
{
"mcp": {
"servers": {
"mnemosyne": {
"command": "mnemosyne",
"args": ["serve"],
"env": {
"OPENAI_API_KEY": "${env:OPENAI_API_KEY}"
}
}
}
}
}Edit ~/Library/Application Support/ChatGPT/mcp_config.json (macOS):
{
"mnemosyne": {
"command": "mnemosyne",
"args": ["serve"],
"env": {
"OPENAI_API_KEY": "${env:OPENAI_API_KEY}"
}
}
}Once configured, you can ask your AI assistant:
Tip:
mnemosyne servereads the same configuration chain as the CLI. Supply--config, set$MNEMOSYNE_CONFIG, or drop a.mnemosyne.tomlnext to your heap dumps so MCP sessions inherit your[analysis], AI, and output defaults automatically.
- "Analyze heap.hprof and show me the root cause."
- "Open the file responsible for the retained objects."
- "Find all coroutine leaks in the heap dump."
- "Generate a fix for the memory leak in UserSessionCache."
- "Show me the git blame for the method that introduced this leak."
| Command | Description |
|---|---|
parse_heap |
Parse a heap dump and return summary |
detect_leaks |
Detect memory leaks with severity levels |
map_to_code |
Map leaked objects to source code locations |
find_gc_path |
Find path from object to GC root |
explain_leak |
Get AI explanation for detected leak |
propose_fix |
Generate code fix suggestions |
apply_fix |
Apply fix to source code |
mnemosyne/
│
├── core/
│ ├── src/
│ │ ├── hprof/ # HPROF parsing (streaming + binary + object graph)
│ │ ├── graph/ # Dominator tree, GC paths, graph metrics
│ │ ├── analysis/ # Leak detection, investigation analyzers, and AI insights
│ │ ├── mapper/ # Source code mapping + Git integration
│ │ ├── report/ # Report rendering (Text/MD/HTML/TOON/JSON)
│ │ ├── fix/ # Fix generation
│ │ ├── mcp/ # MCP JSON-RPC server
│ │ ├── config.rs # Configuration types
│ │ ├── errors.rs # CoreError types
│ │ └── lib.rs # Public API re-exports
│
├── cli/
│ └── src/main.rs # CLI entry point
│
├── resources/
│ └── test-fixtures/ # Fixture documentation for parser/graph tests
│
├── .github/
│ └── workflows/ci.yml # GitHub Actions workspace validation
│
├── web/# (Future) WASM/Web dashboard
│
└── Cargo.toml
Mnemosyne is built for speed and efficiency:
Captured: 2026-03-09 on a 156 MB Kotlin + Spring Boot heap dump (
resources/test-fixtures/heap.hprof)
Measured results (Criterion, release profile):
| Benchmark | Median | Throughput | Notes |
|---|---|---|---|
| Streaming parser (real heap) | 67.2 ms | 2.25 GiB/s | Record-level scanning, 5.12 MiB peak RSS in the latest Step 11 run |
| Binary parser -> ObjectGraph | 1.71 s | 90.5 MiB/s | Full graph construction |
| Dominator tree (Lengauer-Tarjan) | 1.85 s | -- | Full retained-size computation |
| Graph reference lookup | 26.7 ns | -- | Single-object outgoing refs |
| Retained-size top-N query | 712 us | -- | On real 156 MB heap |
Memory usage (RSS:dump ratio):
| Command | Peak RSS | Ratio | Notes |
|---|---|---|---|
parse (streaming) |
5.12 MiB | 0.03x | Minimal memory, scales to any dump size |
analyze (default) |
656.65 MiB | 4.23x | Lean graph-backed path; slightly above the original 4x target |
leaks (default) |
656.46 MiB | 4.23x | Same lean graph-backed path with real dominator-backed retained sizes |
analyze --strings --threads --collections |
~741 MiB | 4.78x | Opt-in field-data retention for investigation analyzers |
Full benchmark details: docs/performance/memory-scaling.md
Default graph-backed runs now keep raw field retention disabled unless thread, string, or collection investigation is requested. That split is implemented through ParseOptions, which keeps the higher-memory path opt-in instead of making every analyze or leaks run pay the full Phase 2 cost.
- Streaming parsing: BufReader-based sequential processing with low memory overhead
- Rust performance: Near-C speeds with memory safety guarantees
- Streaming architecture: Processes dumps larger than available RAM
- Conditional field retention: default
analyze/leaksstay on a lean parser path while deeper investigation flags opt into raw field retention only when needed - Efficient graph algorithms:
petgraphwith Lengauer-Tarjan dominator computation
- Rust heap dump parser
- Dominator tree
- Basic leak detection
- CLI + MCP server
- AI explanations
- Source code mapping
- Full IDE integration
- AI auto-fixes
- Leak reproduction generator
- PR leak detection & CI integration
- JVM Agent
- Memory growth forecasting
- GC log + thread dump correlation
- Web dashboard
- WASM-based in-browser analyzer
- GPU-accelerated graph computation
We welcome contributions from the community! Whether it's:
- 🐛 Bug reports
- 💡 Feature requests
- 📝 Documentation improvements
- 🔧 Code contributions
Please see our CONTRIBUTING.md for:
- Development setup
- Coding standards
- Testing guidelines
- Pull request process
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Make your changes
- Run tests (
cargo test) - Commit with a descriptive message (see .github/copilot-instructions.md for commit style)
- Push to your fork
- Open a Pull Request
For major changes, please open an issue first to discuss what you'd like to change.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Mnemosyne is built to simplify JVM memory debugging by combining Rust performance with AI intelligence. It aims to make heap analysis faster, smarter, and more accessible.
- Quick Start Guide - Get started in 5 minutes
- Architecture - Detailed system design
- API Reference - MCP API documentation
- Configuration Guide - All configuration options
- Contributing - How to contribute
- Examples - Usage examples and scripts
- Changelog - Version history
- Status - Snapshot of shipped vs. planned functionality