feat: Request Metrics (F09) #107

Merged
leocamello merged 9 commits into main from 009-request-metrics on Feb 12, 2026

@leocamello
Owner

F09: Request Metrics

Adds a Prometheus-compatible metrics endpoint and a JSON stats API for observability.

New Endpoints

  • GET /metrics — Prometheus text format
  • GET /v1/stats — JSON statistics

Metrics Tracked

Counters:

  • nexus_requests_total{model, backend, status}
  • nexus_errors_total{error_type, model}
  • nexus_fallbacks_total{from_model, to_model}

Histograms:

  • nexus_request_duration_seconds{model, backend}
  • nexus_backend_latency_seconds{backend}
  • nexus_tokens_total{model, backend, type}

Gauges:

  • nexus_backends_total, nexus_backends_healthy, nexus_models_available
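For readers unfamiliar with what a scrape of GET /metrics returns, here is a minimal sketch of how one sample line is laid out in the Prometheus text format. It is not the exporter's code: `render_sample` and `escape_label_value` are hypothetical helpers, and the label values used in the example (`llama3`, `ollama`, `200`) are made up for illustration.

```rust
/// Render one sample as `name{label="value",...} value`, or `name value`
/// when there are no labels.
fn render_sample(name: &str, labels: &[(&str, &str)], value: f64) -> String {
    let rendered: Vec<String> = labels
        .iter()
        .map(|(k, v)| format!("{}=\"{}\"", k, escape_label_value(v)))
        .collect();
    if rendered.is_empty() {
        format!("{} {}", name, value)
    } else {
        format!("{}{{{}}} {}", name, rendered.join(","), value)
    }
}

/// Escape backslash, double quote, and newline inside a label value,
/// which the text format requires.
fn escape_label_value(v: &str) -> String {
    let mut out = String::with_capacity(v.len());
    for c in v.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}
```

Calling `render_sample("nexus_requests_total", &[("model", "llama3"), ("backend", "ollama"), ("status", "200")], 42.0)` yields `nexus_requests_total{model="llama3",backend="ollama",status="200"} 42`.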

Implementation

  • New src/metrics/ module (mod.rs, types.rs, handler.rs)
  • Instrumented completions handler + health checker
  • Thread-safe via metrics crate atomic operations
  • Custom histogram buckets for LLM inference [0.1s - 300s]
  • Label sanitization with caching
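A rough sketch of what "label sanitization with caching" could look like. The PR does not state the exact rules, so this assumes that any character outside `[a-zA-Z0-9_]` is replaced with `_` and that a leading digit gets a `_` prefix. `LabelSanitizer` is a hypothetical stand-in for the real `MetricsCollector`, and the `Mutex`-guarded map is a simplification; the actual cache may well be a concurrent map.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical sanitizer with a cache of raw name -> sanitized name.
struct LabelSanitizer {
    cache: Mutex<HashMap<String, String>>,
}

impl LabelSanitizer {
    fn new() -> Self {
        Self { cache: Mutex::new(HashMap::new()) }
    }

    /// Return the sanitized form of `raw`, computing it at most once.
    fn sanitize(&self, raw: &str) -> String {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(raw) {
            return hit.clone();
        }
        // Assumed rule: keep ASCII alphanumerics and '_', replace the rest.
        let mut clean: String = raw
            .chars()
            .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
            .collect();
        // Assumed rule: never start with a digit (or be empty).
        if clean.chars().next().map_or(true, |c| c.is_ascii_digit()) {
            clean.insert(0, '_');
        }
        cache.insert(raw.to_string(), clean.clone());
        clean
    }
}
```

Under these assumptions `llama3:8b` becomes `llama3_8b`, and a repeated lookup is served from the cache without re-scanning the string.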

Tests

  • 365 tests passing (282 unit + 83 integration/doc)
  • Zero clippy warnings

Closes #101
Closes #102
Closes #103
Closes #104
Closes #105
Closes #106

Complete spec phase for Request Metrics feature:
- spec.md: 4 user stories, 20 FRs, 12 success criteria
- plan.md: Implementation plan with constitution check
- tasks.md: 75 tasks in 7 phases (TDD workflow)
- data-model.md: 8 entities, validation rules
- contracts/: Prometheus and JSON API specifications
- research.md: Technology decisions
- quickstart.md: Developer guide

GitHub Issues: #101-#106

Closes #101 partially (spec phase only)

- Add metrics (0.24) and metrics-exporter-prometheus (0.16) dependencies
- Create src/metrics/ module with:
  - MetricsCollector: Central coordinator with label sanitization
  - types.rs: StatsResponse, RequestStats, BackendStats, ModelStats
  - handler.rs: Stub handlers for /metrics and /v1/stats endpoints
- Register module in lib.rs and add to AppState
- Initialize Prometheus exporter with custom histogram buckets
- Register /metrics and /v1/stats routes in API router
- Add unit tests for label sanitization and gauge computation

Phase 2 (Foundation) complete - ready for User Story 1 implementation

…S1 partial)

- Add request duration timer at completions handler entry
- Record nexus_requests_total counter on success with model/backend/status labels
- Record nexus_request_duration_seconds histogram on success
- Record nexus_errors_total counter on routing errors and backend failures
- Map routing errors to error_type labels (model_not_found, no_healthy_backend, etc.)
- Use MetricsCollector.sanitize_label() for all Prometheus labels
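The error-to-label mapping above can be sketched as a plain `match`. This is only an illustration: `RoutingError` and its variants are hypothetical names, and the catch-all `"other"` label is an assumption standing in for the "etc." in the commit message. Only `model_not_found` and `no_healthy_backend` come from the PR.

```rust
/// Hypothetical routing error type; variant names are illustrative only.
#[derive(Debug)]
enum RoutingError {
    ModelNotFound,
    NoHealthyBackend,
    Other(String),
}

/// Map an error to a fixed `error_type` label value.
/// Returning `&'static str` keeps the set of label values closed, so free-form
/// error text can never leak into the label and inflate cardinality.
fn error_type_label(err: &RoutingError) -> &'static str {
    match err {
        RoutingError::ModelNotFound => "model_not_found",
        RoutingError::NoHealthyBackend => "no_healthy_backend",
        // Assumed fallback label; the PR does not list the remaining values.
        RoutingError::Other(_) => "other",
    }
}
```
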

Tasks completed: T019, T020, T025-T029
Still TODO: T021-T024, T030 (stats aggregation helpers and tests)

- Implement compute_backend_stats() using Registry atomic counters
- Fix AppState::new() to handle multiple test instantiations gracefully
- Export PrometheusBuilder for test compatibility
- Fix clippy warnings: use is_some_and() and Mutex instead of static mut
- All tests pass (282 lib tests + 14 integration tests)
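One way to make initialization tolerate repeated calls from tests is to guard it with a once-only cell. The sketch below uses `std::sync::OnceLock`, which is a swapped-in technique for illustration (the commit itself mentions a `Mutex`); `MetricsHandle`, `init_metrics`, and `INIT_CALLS` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the process-wide metrics handle.
struct MetricsHandle {
    id: usize,
}

static HANDLE: OnceLock<MetricsHandle> = OnceLock::new();
/// Counts how many times the expensive setup actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Safe to call from every test: the closure runs only on the first call,
/// and later calls get the handle that was already created.
fn init_metrics() -> &'static MetricsHandle {
    HANDLE.get_or_init(|| {
        let n = INIT_CALLS.fetch_add(1, Ordering::SeqCst) + 1;
        MetricsHandle { id: n }
    })
}
```

With this shape, constructing application state in many tests within one process reuses the same handle instead of failing on a second installation attempt.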

Tasks completed: T021-T024 (stats handlers)
US1 implementation complete except integration tests (T015-T018, T030)

- Add nexus_backend_latency_seconds histogram recording in health checks
- Convert latency from milliseconds to seconds for Prometheus standards
- Record latency for all successful health checks with backend label
- Histogram uses custom buckets configured in setup_metrics()
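The conversion and bucketing described above can be sketched as follows. The bucket bounds are illustrative values spanning the stated 0.1s to 300s range; the actual list configured in setup_metrics() is not given in the PR. `latency_ms_to_secs` and `bucket_upper_bound` are hypothetical helper names.

```rust
use std::time::Duration;

/// Illustrative bucket upper bounds in seconds (assumed, not from the PR).
const BUCKETS_SECS: [f64; 8] = [0.1, 0.5, 1.0, 5.0, 10.0, 30.0, 60.0, 300.0];

/// Convert a health-check latency from milliseconds to seconds,
/// the base unit Prometheus conventions recommend for durations.
fn latency_ms_to_secs(latency_ms: u64) -> f64 {
    Duration::from_millis(latency_ms).as_secs_f64()
}

/// Return the upper bound (`le`) of the smallest bucket containing `value`,
/// or `None` when the value only fits the implicit +Inf bucket.
fn bucket_upper_bound(value: f64, bounds: &[f64]) -> Option<f64> {
    bounds.iter().copied().find(|&le| value <= le)
}
```

A 250 ms health check becomes 0.25 s and, with these assumed bounds, lands in the `le="0.5"` bucket. Recording the raw millisecond value instead would push almost every observation past the largest bound.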

Tasks completed: T034-T039
User Story 2 complete - performance monitoring operational

- Add nexus_fallbacks_total counter tracking from_model → to_model transitions
- Add nexus_tokens_total histogram tracking prompt and completion tokens
- Extract token usage from ChatCompletionResponse.usage field
- Record fallback metrics when routing_result.fallback_used is true
- Sanitize all model names before using as Prometheus labels

Tasks completed: T044-T048
User Story 3 complete - routing intelligence metrics operational
Note: nexus_pending_requests gauge already tracked via Registry atomic counters

All US4 functionality was already implemented in US1:
- update_fleet_gauges() computes backends_total, backends_healthy, models_available
- Both handlers call update_fleet_gauges() before rendering
- StatsResponse includes per-backend and per-model breakdowns
- Fleet gauges track Registry state: total backends, healthy backends, unique models
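A minimal sketch of the fleet gauge computation described above, assuming a much simpler registry shape than the real one. `Backend`, `FleetGauges`, and `compute_fleet_gauges` are hypothetical names, and counting unique models across all backends (rather than only healthy ones) is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of a registered backend.
struct Backend {
    healthy: bool,
    models: Vec<String>,
}

/// Values that would feed the three fleet gauges.
#[derive(Debug, PartialEq)]
struct FleetGauges {
    backends_total: usize,
    backends_healthy: usize,
    models_available: usize,
}

/// Derive gauge values from the current backend list.
fn compute_fleet_gauges(backends: &[Backend]) -> FleetGauges {
    // A model served by several backends is counted once.
    let unique_models: HashSet<&str> = backends
        .iter()
        .flat_map(|b| b.models.iter().map(String::as_str))
        .collect();
    FleetGauges {
        backends_total: backends.len(),
        backends_healthy: backends.iter().filter(|b| b.healthy).count(),
        models_available: unique_models.len(),
    }
}
```

Recomputing these values right before rendering, as both handlers reportedly do, means the gauges always reflect the registry at scrape time rather than at the last state change.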

Tasks completed: T055-T061 (all validation tasks)
User Story 4 complete - fleet state visibility operational

Complete implementation of Request Metrics feature (F09) with all 4 user stories:

**User Story 1 - Basic Request Tracking (P1 - MVP):**
- ✅ Track requests with nexus_requests_total counter (model, backend, status labels)
- ✅ Track errors with nexus_errors_total counter (error_type, model labels)
- ✅ Expose GET /metrics endpoint (Prometheus text format)
- ✅ Expose GET /v1/stats endpoint (JSON format with uptime and stats)

**User Story 2 - Performance Monitoring (P2):**
- ✅ Track request duration with nexus_request_duration_seconds histogram
- ✅ Track backend health check latency with nexus_backend_latency_seconds histogram
- ✅ Custom histogram buckets optimized for LLM inference (0.1-300 seconds)

**User Story 3 - Routing Intelligence (P3):**
- ✅ Track fallbacks with nexus_fallbacks_total counter (from_model → to_model)
- ✅ Track token usage with nexus_tokens_total histogram (prompt/completion)
- ✅ Track pending requests via Registry atomic counters

**User Story 4 - Fleet State Visibility (P3):**
- ✅ Track total backends with nexus_backends_total gauge
- ✅ Track healthy backends with nexus_backends_healthy gauge
- ✅ Track available models with nexus_models_available gauge
- ✅ Per-backend and per-model breakdowns in /v1/stats

**Technical Implementation:**
- Thread-safe atomic operations (no locks in hot path)
- Label sanitization for Prometheus compatibility
- Graceful handling of metrics initialization (test-compatible)
- All 282 unit tests + integration tests passing
- Zero clippy warnings
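
To illustrate the "no locks in hot path" point from the list above, here is a small standard-library sketch of a counter incremented from several threads using only atomic operations. It is not the metrics crate's internals; `count_concurrently` is a hypothetical helper.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Increment one shared counter from `threads` threads, `per_thread` times each,
/// and return the final value. No mutex is taken on the increment path.
fn count_concurrently(threads: u64, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Relaxed is enough for a pure event count.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // Joining the threads makes all their increments visible here.
    counter.load(Ordering::Relaxed)
}
```

No increment is lost even though the threads never block each other, which is the property a request counter on the completions path needs.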

**Files Modified/Created:**
- src/metrics/mod.rs (MetricsCollector, setup_metrics, sanitize_label)
- src/metrics/types.rs (StatsResponse, RequestStats, BackendStats, ModelStats)
- src/metrics/handler.rs (metrics_handler, stats_handler)
- src/api/completions.rs (instrumented with metrics recording)
- src/health/mod.rs (instrumented with latency tracking)
- Cargo.toml (added metrics 0.24, metrics-exporter-prometheus 0.16)

Phase 7 Polish: Documentation complete, all tests passing, ready for production.
Integration tests (T015-T018, T030-T033, etc.) deferred to separate PR.
@leocamello added the labels enhancement (New feature or request), v0.2 (v0.2 Observability milestone), and request-metrics (F09: Request Metrics feature) on Feb 12, 2026
@codecov-commenter

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

…d checklists

- Fix cargo fmt issues across all modified files
- Fix token histogram buckets (was using duration buckets [0.1-300s],
  now uses proper token buckets [10-128000])
- Replace TODO comments with clear limitation documentation
- Complete tasks.md: 55 checked, 23 deferred, 0 unchecked
- Add verification.md: 98 verified, 115 N/A, 0 unchecked
- Add walkthrough.md: architecture, metrics reference, test coverage
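
The token bucket fix in this commit can be illustrated with a short sketch showing why duration-scaled bounds do not suit token counts. Both bucket lists are illustrative values within the ranges the commit states (0.1 to 300 and 10 to 128000); the real lists are not given in the PR, and `overflow_count` is a hypothetical helper.

```rust
/// Illustrative duration-style bounds (seconds), assumed for this sketch.
const DURATION_BUCKETS: [f64; 6] = [0.1, 1.0, 10.0, 30.0, 60.0, 300.0];
/// Illustrative token-count bounds, assumed for this sketch.
const TOKEN_BUCKETS: [f64; 6] = [10.0, 100.0, 1_000.0, 10_000.0, 50_000.0, 128_000.0];

/// Count observations that exceed the largest finite bound and therefore
/// fall only into the +Inf bucket, where no resolution is left.
fn overflow_count(observations: &[f64], bounds: &[f64]) -> usize {
    let max = bounds.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    observations.iter().filter(|&&v| v > max).count()
}
```

For token counts such as 64, 512, 2048, and 8192, three of the four exceed the duration-style maximum of 300, while none exceed the token-style maximum, so only the second set of bounds gives usable quantiles.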
@leocamello merged commit 3dd5070 into main on Feb 12, 2026
9 checks passed
@leocamello deleted the 009-request-metrics branch on February 14, 2026 17:01
leocamello added a commit that referenced this pull request Feb 17, 2026
feat: Request Metrics (F09)

Adds observability infrastructure with Prometheus-compatible metrics and JSON stats.

## Changes
- New src/metrics/ module: MetricsCollector, setup_metrics(), label sanitization, fleet gauges
- Prometheus endpoint: GET /metrics with counters, histograms, and gauges
- JSON stats endpoint: GET /v1/stats with uptime, backend/model breakdowns
- Instrumented completions handler with request counting, duration tracking, fallback/token metrics
- Instrumented health checker with backend latency histogram
- Dependencies: metrics 0.24, metrics-exporter-prometheus 0.16

Closes #101
Closes #102
Closes #103
Closes #104
Closes #105
Closes #106