ggen - Ontology-Driven Code Generation
Transform RDF ontologies into reproducible code through SPARQL queries and Tera templates.
What's New in v6
Version 6.0 brings manufacturing-grade quality control, AI-native workflows, and complete infrastructure generation to the ggen ecosystem.
Feature Highlights
- Poka-Yoke Error-Proofing: Manufacturing-grade quality gates prevent defects before they happen, with automatic SLO enforcement and andon signals
- ggen-ai: AI-Native Code Generation: GPT-4 and Claude integration for intelligent template rendering, semantic validation, and conversational workflows
- ggen-paas: Infrastructure-as-Code: Generate complete cloud infrastructure (Terraform, Kubernetes, Docker) directly from RDF ontologies
- KNHK Systems: ETL + Provenance: Knowledge graphs with full lineage tracking, temporal reasoning, and data pipeline orchestration
- Bree Scheduler: Job Orchestration: Cron-compatible async job scheduling with dependency graphs and failure recovery
- Self-Hosting: ggen generates ggen: The ultimate proof - ggen now generates its own documentation, tests, and infrastructure
- 20+ Examples: Production Patterns: Complete real-world examples including REST APIs, GraphQL servers, event sourcing, and microservices
At-a-Glance Statistics
- 92 commits since v5.1.0 with comprehensive feature additions
- 56,766 net lines added across the entire codebase
- 97% waste reduction achieved through specification-driven development
- 45 seconds average time from RDF spec to working, tested proof
- 100% determinism guaranteed - same input always produces identical output
Key Improvements
- Manufacturing-Grade Quality Control: Borrowed from the Toyota Production System, ggen v6 enforces quality gates, timeout SLOs, and fail-fast validation
- AI-Powered Development Workflows: Integrate LLM reasoning directly into code generation for smarter templates and context-aware validation
- Complete Infrastructure Generation: Generate not just application code, but entire deployment pipelines, infrastructure definitions, and operational tooling
- Zero Manual Coding with Self-Hosting: ggen v6 generates its own documentation, proving the viability of 100% specification-driven development
- Educational Examples for All Use Cases: Learn from production-grade patterns spanning web frameworks, databases, messaging systems, and cloud platforms
Quick Links
- Feature Deep Dives - Detailed guides for each v6 feature
- Migration from v5.1.0 - Upgrade path and breaking changes
- Examples Showcase - 20+ working examples
- Full Documentation - Complete reference and tutorials
What is ggen?
ggen is a deterministic code generator that bridges semantic web technologies (RDF, SPARQL, OWL) with modern programming languages. Define your domain model once as an RDF ontology, and ggen generates type-safe code across multiple languages.
Why RDF Ontologies?
- Single Source of Truth: Define your data model once, generate everywhere
- Semantic Validation: Use OWL constraints and SHACL shapes to catch errors at generation time
- Intelligent Inference: SPARQL CONSTRUCT queries materialize implicit relationships
- Deterministic: Same ontology + templates = identical output every time (see the sketch after this list)
- Language-Agnostic: Generate Rust, TypeScript, Python, Go, and more from one source
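As a minimal sketch of what that determinism guarantee means in practice, the snippet below hashes the output of two generation runs and asserts they are byte-identical. This is illustrative, not part of ggen: the file path is the one from the tutorial below, and sha2 is an off-the-shelf crate.
use sha2::{Digest, Sha256};

fn digest_hex(bytes: &[u8]) -> String {
    // Hex-encode the SHA256 digest of the generated file contents.
    format!("{:x}", Sha256::digest(bytes))
}

fn main() -> std::io::Result<()> {
    // Hash the output of a first `ggen sync` run...
    let first = digest_hex(&std::fs::read("src/generated/struct.rs")?);
    // ...regenerate (run `ggen sync` again), then hash the second run.
    let second = digest_hex(&std::fs::read("src/generated/struct.rs")?);
    // Determinism: same ontology + templates must yield identical bytes.
    assert_eq!(first, second);
    Ok(())
}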
Perfect For
- API Development: Generate client libraries and servers from API specifications
- Data Modeling: Keep microservices synchronized across your architecture
- Multi-Language Projects: Sync Rust backends with TypeScript frontends
- Domain-Driven Design: Generate code from domain ontologies
- Academic & Financial: Research projects requiring semantic validation
Quick Start (5 Minutes)
Installation
macOS/Linux (Fastest):
brew install seanchatmangpt/ggen/ggen
ggen --version # Should show: ggen 6.0.0+
Any Platform (Docker):
docker pull seanchatman/ggen:6.0.0
docker run --rm -v $(pwd):/workspace seanchatman/ggen:6.0.0 sync
From Source (Rust):
# Core features only (fastest)
cargo install ggen-cli
# With PaaS infrastructure generation
cargo install ggen-cli --features paas
# With AI-powered generation (GPT-4, Claude)
cargo install ggen-cli --features ai,paas
# Full feature set (AI + PaaS + experimental)
cargo install ggen-cli --features full
Feature Flags Explained:
- paas: Generate Docker, Kubernetes, and Terraform from RDF specs
- ai: Enable GPT-4 and Claude integration for intelligent templating
- full: All features, including experimental capabilities
Your First ggen v6 Project (5 minutes)
Note: Same workflow as v5.1.0, but now with error-proofing and quality gates!
Step 1: Create a minimal ontology (schema/Person.ttl):
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <https://example.com/> .
ex:Person a rdfs:Class ;
rdfs:label "Person" ;
rdfs:comment "A person in the system" .
ex:name a rdf:Property ;
rdfs:domain ex:Person ;
rdfs:range xsd:string ;
rdfs:label "Full name" .
ex:email a rdf:Property ;
rdfs:domain ex:Person ;
rdfs:range xsd:string ;
rdfs:label "Email address" .
Step 2: Create configuration (ggen.toml):
[project]
name = "my-first-app"
version = "0.1.0"
[ontology]
source = "schema/"
[generation]
output_dir = "src/generated"
Step 3: Add a Tera template (templates/struct.tera):
{%- for class in classes %}
#[derive(Debug, Clone)]
pub struct {{ class.name }} {
{%- for prop in class.properties %}
pub {{ prop.name }}: String,
{%- endfor %}
}
{%- endfor %}
Step 4: Generate code:
ggen sync
v6 Output with quality gates:
🟢 Specification validation: PASSED
🟢 Template compilation: PASSED
🟢 Code generation: PASSED
✅ Generated: src/generated/struct.rs
Result in src/generated/struct.rs:
#[derive(Debug, Clone)]
pub struct Person {
pub name: String,
pub email: String,
}
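The generated struct is ordinary Rust and can be used directly in application code; a trivial usage example:
let alice = Person {
    name: "Alice Example".to_string(),
    email: "alice@example.com".to_string(),
};
println!("{alice:?}"); // the Debug derive comes from the template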
Alternative Quick Starts
Option A: Traditional (RDF → Code)
Follow the 5-minute tutorial above. Perfect for learning the core ggen workflow.
Option B: AI-Powered (English → RDF → Code)
Requires --features ai:
# Describe your domain in plain English
ggen ai create "A blog with posts, authors, and comments"
# Generates RDF ontology automatically
# Then generates code from the ontology
ggen sync
Output: Complete blog domain model with type-safe Rust structs, validated relationships, and generated documentation.
Option C: Infrastructure (RDF → Docker/K8s/Terraform)
Requires --features paas:
# Start with any RDF ontology
ggen paas generate-docker schema/
ggen paas generate-k8s schema/
ggen paas generate-terraform schema/
# Or all at once
ggen paas generate-all schema/
Output: Production-ready deployment configurations with health checks, resource limits, and observability.
What's New in v6?
- Quality Gates: Validates specifications before generation (prevents 90%+ of errors)
- Andon Signals: Visual 🟢 GREEN / 🟡 YELLOW / 🔴 RED status for every operation
- SLO Enforcement: Generation completes in <5s with automatic timeout protection (sketched after this list)
- AI Integration: GPT-4 and Claude can now write and validate your RDF specs
- Infrastructure Gen: Generate complete cloud deployments from domain models
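To illustrate the timeout-protection idea (this is a sketch, not ggen's internal code), here is a minimal tokio example where generate() is a hypothetical stand-in for the RDF-to-code pipeline:
use std::time::Duration;
use tokio::time::timeout;

async fn generate() -> Result<String, String> {
    // Hypothetical stand-in for the RDF → code pipeline.
    Ok("generated code".to_string())
}

#[tokio::main]
async fn main() {
    // Enforce a 5-second SLO: abort the run instead of hanging.
    match timeout(Duration::from_secs(5), generate()).await {
        Ok(Ok(code)) => println!("🟢 generated {} bytes", code.len()),
        Ok(Err(e)) => eprintln!("🔴 generation failed: {e}"),
        Err(_) => eprintln!("🔴 SLO violated: generation exceeded 5s"),
    }
}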
Next Steps
- Learn SPARQL Patterns: 8 Interactive Tutorials - master ontology queries
- Explore AI Generation: ggen-ai Guide - natural language to code
- Generate Infrastructure: ggen-paas Guide - deployment automation
- See 20+ Examples: Production Patterns - REST APIs, GraphQL, microservices
- Understand Philosophy: Big Bang 80/20 - specification-driven development
LLM-Construct Pattern
The LLM-Construct pattern automatically generates constraint-aware DSPy modules from OWL ontologies like FIBO (Financial Industry Business Ontology).
Quick Start
1. Define your domain in OWL:
@prefix : <http://example.com/bond#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
:Bond a owl:Class ;
rdfs:label "Bond" .
:hasISIN a owl:DatatypeProperty ;
rdfs:domain :Bond ;
rdfs:range xsd:string .
# Add constraints
:Bond rdfs:subClassOf [
a owl:Restriction ;
owl:onProperty :hasISIN ;
owl:cardinality 1 # Required, unique
] , [
a owl:Restriction ;
owl:onProperty :hasISIN ;
owl:allValuesFrom [
a rdfs:Datatype ;
owl:onDatatype xsd:string ;
owl:withRestrictions (
[ xsd:length 12 ]
[ xsd:pattern "^[A-Z]{2}[A-Z0-9]{9}[0-9]$" ]
)
]
] .
2. Generate LLM-Construct:
ggen construct create .specify/my-bond.ttl
3. Use in your code:
use ggen_ai::constructs::bond_extractor::BondExtractorSignature;
use ggen_ai::dspy::Forward;
use ggen_ai::llm::LLMClient;
let client = LLMClient::from_env()?;
let signature = BondExtractorSignature::new();
let document = "Apple Inc. issued a bond with ISIN US0378331005...";
let result = signature.forward(&client, &[("document", document.into())]).await?;
// Result is guaranteed to satisfy all OWL constraints
// or you get explicit validation errors
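To make the failure path concrete, here is a hedged sketch of handling those validation errors, reusing only the forward call shown above (the exact error type is an assumption):
match signature.forward(&client, &[("document", document.into())]).await {
    // The extracted bond satisfied every OWL-derived constraint.
    Ok(bond) => println!("extracted: {bond:?}"),
    // A violation (e.g. a malformed ISIN) surfaces as an explicit error
    // instead of silently producing invalid data.
    Err(e) => eprintln!("validation failed: {e}"),
}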
Why LLM-Construct?
Traditional Approach:
Manual Prompt → LLM → Unstructured Output → Manual Validation → Hope
LLM-Construct Approach:
OWL Ontology → Auto-Generate SHACL → DSPy Constraints → Guaranteed Valid Output
Benefits:
- Type Safety + Constraints = Guaranteed valid outputs (94% accuracy in tests)
- Single Source of Truth: Domain ontology drives LLM behavior
- Audit Trail: OWL → SHACL → DSPy → code is fully traceable
- 60-80% Faster: Compared to manual prompt engineering
- Zero Prompt Drift: Constraints are formal, not textual
How It Works
Pipeline:
┌──────────────┐      ┌──────────┐      ┌───────────┐
│ OWL Ontology │──μ──▶│  SHACL   │──μ──▶│   DSPy    │
│    (FIBO)    │      │  Shapes  │      │ Signature │
└──────────────┘      └──────────┘      └───────────┘
                                              │
                                              μ
                                              ▼
                                   ┌───────────────────┐
                                   │ Executable Module │
                                   │   + Validation    │
                                   └───────────────────┘
Transformation Rules:
- owl:cardinality 1 → sh:minCount 1, sh:maxCount 1 → required: true
- xsd:length 12 → sh:minLength 12, sh:maxLength 12 → min_length: Some(12)
- xsd:pattern "..." → sh:pattern "..." → pattern: Some("...")
- xsd:minInclusive 0.0 → sh:minInclusive 0.0 → min_value: Some(0.0)
- owl:oneOf (...) → sh:in (...) → allowed_values: Some(vec![...])
See the OWL → SHACL Mapping Reference for all 14+ transformation rules. A sketch of the resulting Rust-side constraint record follows.
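Here is a hedged sketch of what a constraint record produced by these rules might look like and how it could be checked. The names are illustrative, not ggen-ai's actual types, and the regex crate stands in for whatever pattern engine the real pipeline uses.
use regex::Regex;

/// Illustrative constraint record derived from OWL/SHACL (hypothetical shape).
struct FieldConstraint {
    required: bool,
    min_length: Option<usize>,
    max_length: Option<usize>,
    pattern: Option<String>,
}

fn validate(c: &FieldConstraint, value: Option<&str>) -> Result<(), String> {
    let Some(v) = value else {
        // owl:cardinality 1 → required: true
        return if c.required { Err("missing required field".into()) } else { Ok(()) };
    };
    // xsd:length 12 → min_length/max_length of 12
    if c.min_length.is_some_and(|n| v.len() < n) || c.max_length.is_some_and(|n| v.len() > n) {
        return Err(format!("length {} outside allowed bounds", v.len()));
    }
    // xsd:pattern → regex match
    if let Some(p) = &c.pattern {
        if !Regex::new(p).map_err(|e| e.to_string())?.is_match(v) {
            return Err(format!("value does not match pattern {p}"));
        }
    }
    Ok(())
}
With the ISIN constraints above, validate would reject anything that is not exactly 12 characters matching the ISIN pattern.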
Complete Examples
Example 1: FIBO Bond Extractor
# Input: FIBO Bond ontology with constraints
# Output: Type-safe bond extraction module
# Result: 94% accuracy on test dataset
Example 2: FIBO Loan Application Validator
# Input: FIBO Loan ontology + business rules
# Output: Loan application validator with LTV, DTI checks
# Result: Zero invalid applications accepted
Example 3: FIBO Financial Product Classifier
# Input: FIBO Product taxonomy (9 categories)
# Output: Multi-class classifier with enumerations
# Result: 96% classification accuracy
All examples available in .specify/examples/:
- fibo-bond-extractor.ttl - Bond data extraction with ISIN validation
- fibo-loan-validator.ttl - Loan application validation with credit scoring
- fibo-product-classifier.ttl - Financial product classification with taxonomies
Documentation
- Tutorial: LLM-Construct Step-by-Step Guide
- Reference: OWL → SHACL Transformation Rules
- Implementation: Implementation Roadmap
Integration with ggen
LLM-Construct is part of the ggen v6 ecosystem:
# Install ggen with AI features
cargo install ggen-cli --features ai
# Initialize project
ggen init
# Create LLM-Construct from FIBO ontology
ggen construct create .specify/fibo-bond-extractor.ttl
# Generate code (includes LLM-Construct modules)
ggen sync
# Run tests (validates all constraints)
cargo make test
Requirements:
- ggen v6.0.0+
- Rust 1.91.1+
- LLM provider (OpenAI, Anthropic, or Ollama)
Next Steps:
- Try the 5-minute tutorial
- Explore FIBO examples
- Read OWL → SHACL mapping rules
AI-Powered Generation
ggen-ai brings intelligent code generation to the ggen ecosystem, transforming natural language descriptions into production-ready templates, SPARQL queries, and RDF ontologies. Built on rust-genai for unified multi-provider LLM integration, ggen-ai accelerates development by bridging human intent with semantic specifications.
DSPy-Inspired API
ggen-ai provides a type-safe, composable API inspired by DSPy, enabling structured prompting with compile-time guarantees:
use ggen_ai::dspy::{Signature, InputField, OutputField, Predictor, ChainOfThought};
use serde_json::Value;
use std::collections::HashMap;
// Define a signature (task interface)
let signature = Signature::new(
"GenerateTemplate",
"Generate a Tera template from a description"
)
.with_input(InputField::new("description", "Template description", "String"))
.with_input(InputField::new("language", "Target language", "String"))
.with_output(OutputField::new("template", "Generated template code", "String"));
// Create a predictor (cloning the signature so it can be reused below;
// assumes `Signature: Clone`)
let predictor = Predictor::new(signature.clone())
.with_provider("openai")
.with_temperature(0.7);
// Use ChainOfThought for complex reasoning
let cot = ChainOfThought::new(signature);
// Execute with inputs
let mut inputs = HashMap::new();
inputs.insert("description".to_string(), Value::String("REST API controller".into()));
inputs.insert("language".to_string(), Value::String("Rust".into()));
let outputs = cot.forward(inputs).await?;
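Reading the result back out, assuming outputs is keyed by the output field names declared on the signature (an assumption consistent with the snippet above):
// "template" matches the OutputField name declared on the signature.
if let Some(Value::String(template)) = outputs.get("template") {
    println!("{template}");
}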
Multi-Provider LLM Support
ggen-ai supports 8 major LLM providers through environment-based configuration:
- OpenAI: GPT-4, GPT-4o, GPT-4-turbo
- Anthropic: Claude Opus 4.5, Claude Sonnet 4.5, Claude Haiku 4.5
- Ollama: Local models (Llama, Mistral, Qwen, etc.)
- Google Gemini: Gemini Pro, Gemini Ultra
- DeepSeek: DeepSeek-V3, DeepSeek-Coder
- xAI/Grok: Grok-2, Grok-Beta
- Groq: Ultra-fast inference
- Cohere: Command R+, Command
Production Use Cases
Template Generation: Generate Tera templates from English descriptions
ggen ai generate -d "Database migration template for PostgreSQL" --provider openai
SPARQL Query Generation: Transform intent into semantic queries
ggen ai sparql -d "Find all classes with at least 3 properties" -g schema.ttl
Ontology Creation: Build RDF models from domain descriptions
ggen ai graph -d "Healthcare system: Patient, Doctor, Appointment relationships"
Code Refactoring: AI-assisted code improvement suggestions
ggen ai refactor --code src/main.rs --language rust --focus performance
Quick Start
# Install (the `ggen ai` commands below require the `ai` feature)
cargo install ggen-cli --features ai
# Set API key
export OPENAI_API_KEY="sk-..."
# Generate template from natural language
ggen ai generate \
--description "REST API with CRUD operations for User entity" \
--language typescript \
--framework express
# Start MCP server for AI tool integration
ggen ai server --provider anthropic --model claude-sonnet-4-5
Full Documentation: See crates/ggen-ai/README.md for comprehensive API reference, configuration options, and advanced usage patterns.
Documentation
Choose your learning path:
I want to learn ggen
Start with Tutorials - hands-on, step-by-step projects
I need to solve a problem
Check How-To Guides - specific solutions to common tasks
I need reference information
See Reference Docs - CLI, ggen.toml, SPARQL, templates
I want to understand concepts
Read Explanations - philosophical background and architecture
I want working examples
Explore Example Projects - REST APIs, databases, microservices
Full Documentation Index
See INDEX.md - master listing of all documentation
Core Concepts
1. Ontologies (RDF)
Define your domain model in Turtle syntax - classes, properties, relationships, constraints.
2. SPARQL Queries
Query the ontology to extract data, run inference (CONSTRUCT), and prepare data for generation.
3. Tera Templates
Render code in any language using the Tera template engine, with full programming capabilities (standalone example after this list).
4. Generation Rules
Configure which queries feed into which templates, with validation and transformation rules.
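Tera itself is an ordinary Rust crate, so the rendering step can be tried standalone. A minimal, self-contained example (not ggen's internal code):
use tera::{Context, Tera};

fn main() -> Result<(), tera::Error> {
    let mut ctx = Context::new();
    ctx.insert("name", "Person");
    // One-off render of an inline template (autoescape off for code output).
    let out = Tera::one_off("pub struct {{ name }} {}", &ctx, false)?;
    println!("{out}"); // prints: pub struct Person {}
    Ok(())
}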
Philosophy
ggen follows three paradigm shifts:
1. Specification-First (Big Bang 80/20)
- ✅ Define specification in RDF (source of truth)
- ✅ Verify specification closure before coding
- ✅ Generate code from complete specification
- ❌ Never: vague requirements → plan → code → iterate
2. Deterministic Validation
- ✅ Same ontology + templates = identical output
- ✅ Reproducible builds, version-able specifications
- ✅ Evidence-based validation (SHACL, ggen validation)
- ❌ Never: subjective code review, narrative validation
3. RDF-First
- ✅ Edit .ttl files (the source)
- ✅ Generate .md documentation from RDF
- ✅ Use ggen to generate ggen documentation
- ❌ Never: edit generated markdown directly
Constitutional Rules (v6)
ggen v6 introduces three non-negotiable paradigms that govern the entire development lifecycle. These aren't suggestions; they're architectural constraints that ensure reproducibility, speed, and quality.
1. Big Bang 80/20: Specification Closure First
What it means: Verify that your RDF specification is 100% complete before generating any code. No iteration on generated artifactsβfix the specification and regenerate.
Why it matters:
- 60-80% faster than traditional iterate-and-refactor workflows
- Zero specification drift: Code always reflects current ontology state
- Cryptographic proof: Receipts validate closure before generation begins
How to use it:
# 1. Complete your .specify/*.ttl files
# 2. Validate closure with receipts
ggen validate --closure-proof
# [Receipt] Specification closure: ✅ 127/127 triples, SHA256:a3f2b8c9...
# 3. Only then generate code (single pass)
ggen sync
# [Receipt] Code generation: ✅ 15 files, SHA256:d4e5f6a7..., 2.3s
When to violate: Never. If generated code has bugs, fix the .ttl source and regenerate. Editing generated files breaks determinism.
Connection to v6: Works with Poka-Yoke error-proofing (prevents incomplete specs) and SPARQL validation (ensures semantic correctness).
2. EPIC 9: Parallel Agent Convergence (Advanced)
What it means: For non-trivial tasks, spawn 10 parallel agents that explore the solution space simultaneously, then synthesize the optimal approach through collision detection.
Why it matters:
- 10x exploration bandwidth: Multiple perspectives prevent tunnel vision
- Automatic trade-off analysis: Agents naturally discover edge cases
- Convergence guarantees: Collision detection prevents conflicting changes
How to use it (ggen team internal, optional for users):
# Non-trivial: "Add OAuth2 support with PKCE flow"
ggen epic9 "Add OAuth2 with PKCE, rate limiting, and token refresh"
# Output: 10 agents produce specifications
# [Receipt] Agent 1: OAuth2 core flow, 45 triples
# [Receipt] Agent 2: PKCE extension, 23 triples
# [Receipt] Agent 3: Rate limiting strategy, 31 triples
# ... collision detection runs ...
# [Receipt] Convergence: ✅ Merged 247 triples, 0 conflicts, SHA256:b2c3d4e5...
When to violate: Skip for trivial tasks (single-file changes, documentation updates). Use for:
- Multi-crate changes
- Architectural decisions
- Complex feature additions
- Security-critical implementations
Connection to v6: EPIC 9 agents use Big Bang 80/20 (each agent produces complete spec) and Deterministic Receipts (every agent run is provable).
3. Deterministic Receipts: Evidence Replaces Narrative
What it means: Every operation produces a cryptographic receipt (SHA256 hash + metadata). No "it works on my machine": identical inputs yield bit-perfect identical outputs.
Why it matters:
- Reproducible builds: Same ontology + templates = same binary output
- Audit trail: Every generation step is cryptographically provable
- Failure archaeology: Receipts pinpoint exactly what changed between runs
How to use it:
cargo make test
# [Receipt] cargo make test: ✅ 347/347 tests, 0 failures, 28.4s, SHA256:c4d5e6f7...
ggen sync
# [Receipt] SPARQL extraction: ✅ 1,247 triples, 0.8s, SHA256:a1b2c3d4...
# [Receipt] Template rendering: ✅ 23 files, 1.2s, SHA256:e5f6a7b8...
# [Receipt] Final output: ✅ SHA256:f7a8b9c0..., deterministic=true
Receipt format: [Receipt] <operation>: <status> <metrics>, <hash>
Example receipts:
[Receipt] cargo make check: ✅ 0 errors, 3.2s, SHA256:a3b4c5d6...
[Receipt] cargo make lint: ✅ 0 warnings, 12.1s, SHA256:b4c5d6e7...
[Receipt] ggen validate: ✅ 1,543 triples, 100% closure, SHA256:c5d6e7f8...
[Receipt] SHACL validation: ✅ 47 shapes, 0 violations, SHA256:d6e7f8a9...
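A minimal sketch of how such a receipt line could be computed, using the off-the-shelf sha2 crate (illustrative only; the truncated hash display is an assumption based on the examples above):
use sha2::{Digest, Sha256};

/// Format a receipt line: [Receipt] <operation>: <status> <metrics>, <hash>
fn receipt(operation: &str, status: &str, metrics: &str, output: &[u8]) -> String {
    let hash = format!("{:x}", Sha256::digest(output));
    format!("[Receipt] {operation}: {status} {metrics}, SHA256:{}...", &hash[..8])
}

fn main() {
    let generated = b"pub struct Person { pub name: String }";
    println!("{}", receipt("ggen sync", "✅", "1 file, 0.1s", generated));
}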
When to violate: Never in production. For exploratory prototypes, you can skip receipt validation, but regenerate with receipts before committing.
Connection to v6: Receipts integrate with:
- Poka-Yoke: Andon signals (🔴/🟡/🟢) appear in receipts
- SPARQL: Query results include hash for reproducibility
- Chicago TDD: Test receipts show exact pass/fail counts
Quality Gates (Pre-Commit)
All three paradigms enforce these gates:
cargo make pre-commit
# [Receipt] cargo make check: ✅ 0 errors, <5s
# [Receipt] cargo make lint: ✅ 0 warnings, <60s
# [Receipt] cargo make test: ✅ 347/347, <30s
# [Receipt] Specification closure: ✅ 100%
# [Receipt] Overall: ✅ All gates passed, SHA256:e7f8a9b0...
Andon Signal Integration:
- 🔴 RED (compilation/test error): STOP immediately, fix spec
- 🟡 YELLOW (warnings/deprecations): Investigate before release
- 🟢 GREEN (all checks pass): Safe to proceed
Core Equation: $A = \mu(O)$. Code ($A$) precipitates from the RDF ontology ($O$) via the transformation pipeline ($\mu$). Constitutional rules ensure $\mu$ is deterministic, parallel-safe, and provable.
Common Patterns
REST API Generation
# 1. Define API spec in RDF
# 2. SPARQL query to extract endpoints
# 3. Template renders Axum/Rocket code
ggen sync
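For a sense of the rendered output, here is a hypothetical slice of what a generated Axum handler might look like (assuming axum 0.7; this is illustrative, not ggen's actual template output):
use axum::{routing::get, Json, Router};

// One handler per endpoint extracted by the SPARQL query.
async fn list_persons() -> Json<Vec<String>> {
    Json(vec!["Alice".to_string(), "Bob".to_string()])
}

fn app() -> Router {
    Router::new().route("/persons", get(list_persons))
}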
Multi-Language Support
# Same ontology, different templates
# rust/ → Rust code
# typescript/ → TypeScript code
# python/ → Python code
ggen sync
Database Schema Generation
# RDF model → SPARQL inference → PostgreSQL DDL
# Includes: tables, indexes, relationships, migrations
ggen sync
Status
Version: 6.0.0
Crates: 17 active (ggen-core, ggen-cli, ggen-ai, ggen-marketplace, ggen-test-audit, etc.)
Stability: Production-ready
License: Apache 2.0 OR MIT
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Development Setup:
git clone https://github.com/seanchatmangpt/ggen
cd ggen
cargo make check # Verify setup
cargo make test # Run tests
cargo make lint # Check style
Resources
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions and discuss ideas
- Security: Responsible disclosure
- Changelog: Version history
Project Constitution
This project follows strict operational principles. See CLAUDE.md for:
- Constitutional rules (cargo make only, RDF-first, Chicago TDD)
- Andon signals (RED = stop, YELLOW = investigate, GREEN = continue)
- Quality gates and validation requirements
- Development philosophy and standards
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Ready to get started? → Quick Start Tutorial