Complete spec phase for Request Metrics feature:

- spec.md: 4 user stories, 20 FRs, 12 success criteria
- plan.md: Implementation plan with constitution check
- tasks.md: 75 tasks in 7 phases (TDD workflow)
- data-model.md: 8 entities, validation rules
- contracts/: Prometheus and JSON API specifications
- research.md: Technology decisions
- quickstart.md: Developer guide

GitHub Issues: #101-#106

Closes #101 partially (spec phase only)
- Add metrics (0.24) and metrics-exporter-prometheus (0.16) dependencies
- Create src/metrics/ module with:
  - MetricsCollector: Central coordinator with label sanitization
  - types.rs: StatsResponse, RequestStats, BackendStats, ModelStats
  - handler.rs: Stub handlers for /metrics and /v1/stats endpoints
- Register module in lib.rs and add to AppState
- Initialize Prometheus exporter with custom histogram buckets
- Register /metrics and /v1/stats routes in API router
- Add unit tests for label sanitization and gauge computation

Phase 2 (Foundation) complete - ready for User Story 1 implementation
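The commit above mentions label sanitization but does not spell out the rule. A minimal sketch of what such a helper might look like follows; the free function `sanitize_label` and the rule itself (keep ASCII letters, digits, and underscores, replace everything else with `_`, and prefix `_` when the result would be empty or start with a digit) are assumptions for illustration, not the PR's actual `MetricsCollector` code.

```rust
/// Hypothetical stand-in for the PR's label sanitization.
/// Assumed rule (not confirmed by the PR): keep ASCII letters, digits and
/// underscores, replace every other character with '_', and prefix '_'
/// when the result is empty or starts with a digit.
fn sanitize_label(raw: &str) -> String {
    let mut out: String = raw
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect();

    let needs_prefix = match out.chars().next() {
        Some(first) => first.is_ascii_digit(),
        None => true, // empty input
    };
    if needs_prefix {
        out.insert(0, '_');
    }
    out
}
```

Under this assumed rule, a model name such as `llama3:70b` would become `llama3_70b`.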
…S1 partial)

- Add request duration timer at completions handler entry
- Record nexus_requests_total counter on success with model/backend/status labels
- Record nexus_request_duration_seconds histogram on success
- Record nexus_errors_total counter on routing errors and backend failures
- Map routing errors to error_type labels (model_not_found, no_healthy_backend, etc.)
- Use MetricsCollector.sanitize_label() for all Prometheus labels

Tasks completed: T019, T020, T025-T029

Still TODO: T021-T024, T030 (stats aggregation helpers and tests)
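The error-to-label mapping and the duration timer described in this commit could be sketched roughly as below. The `RoutingError` enum, its `Other` variant, and the `timed` helper are hypothetical names introduced for illustration; only the label strings `model_not_found` and `no_healthy_backend` come from the commit message.

```rust
use std::time::Instant;

/// Hypothetical routing error type; the PR's real enum may differ.
#[derive(Debug)]
enum RoutingError {
    ModelNotFound,
    NoHealthyBackend,
    Other, // assumed catch-all, not named in the PR
}

/// Maps an error to a fixed, low-cardinality `error_type` label value.
fn error_type_label(err: &RoutingError) -> &'static str {
    match err {
        RoutingError::ModelNotFound => "model_not_found",
        RoutingError::NoHealthyBackend => "no_healthy_backend",
        RoutingError::Other => "other",
    }
}

/// Runs `f` and returns its result together with the elapsed time in seconds,
/// mirroring a timer started at handler entry.
fn timed<T>(f: impl FnOnce() -> T) -> (T, f64) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed().as_secs_f64())
}
```

Returning `&'static str` keeps the set of label values closed, which helps bound the number of time series Prometheus has to store.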
- Implement compute_backend_stats() using Registry atomic counters
- Fix AppState::new() to handle multiple test instantiations gracefully
- Export PrometheusBuilder for test compatibility
- Fix clippy warnings: use is_some_and() and Mutex instead of static mut
- All tests pass (282 lib tests + 14 integration tests)

Tasks completed: T021-T024 (stats handlers)

US1 implementation complete except integration tests (T015-T018, T030)
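One way to tolerate repeated initialization in tests without `static mut` is to guard a shared handle with a `Mutex`, as sketched below. This is an illustration under assumptions, not the PR's code: `ExporterHandle`, `install_exporter`, and the `INSTALLS` counter are hypothetical, and this `setup_metrics` only borrows the name from the PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical handle standing in for whatever the exporter returns.
#[derive(Debug)]
struct ExporterHandle;

/// Counts how often the (hypothetical) global install step actually ran.
static INSTALLS: AtomicUsize = AtomicUsize::new(0);

/// Shared handle, protected by a Mutex rather than `static mut`.
static HANDLE: Mutex<Option<Arc<ExporterHandle>>> = Mutex::new(None);

/// Stand-in for installing a global recorder, which may only happen once.
fn install_exporter() -> ExporterHandle {
    INSTALLS.fetch_add(1, Ordering::SeqCst);
    ExporterHandle
}

/// Returns the shared handle, installing the exporter only on the first call.
fn setup_metrics() -> Arc<ExporterHandle> {
    // Recover the guard even if another test panicked while holding the lock.
    let mut guard = HANDLE.lock().unwrap_or_else(|e| e.into_inner());
    Arc::clone(guard.get_or_insert_with(|| Arc::new(install_exporter())))
}
```

Every caller, including each test that builds its own application state, then receives the same handle instead of attempting a second global install.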
- Add nexus_backend_latency_seconds histogram recording in health checks
- Convert latency from milliseconds to seconds for Prometheus standards
- Record latency for all successful health checks with backend label
- Histogram uses custom buckets configured in setup_metrics()

Tasks completed: T034-T039

User Story 2 complete - performance monitoring operational
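Prometheus convention is to report durations in base units (seconds), which is why the health-check latency is converted before recording. A small sketch of that conversion, assuming the latency arrives as an integer millisecond count (the function name `latency_seconds` is hypothetical):

```rust
use std::time::Duration;

/// Converts a health-check latency in milliseconds to fractional seconds,
/// the unit expected by a `*_seconds` histogram.
fn latency_seconds(latency_ms: u64) -> f64 {
    Duration::from_millis(latency_ms).as_secs_f64()
}
```

Going through `Duration` keeps the unit explicit at the call site, which is less error-prone than a bare division by 1000.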
- Add nexus_fallbacks_total counter tracking from_model → to_model transitions
- Add nexus_tokens_total histogram tracking prompt and completion tokens
- Extract token usage from ChatCompletionResponse.usage field
- Record fallback metrics when routing_result.fallback_used is true
- Sanitize all model names before using as Prometheus labels

Tasks completed: T044-T048

User Story 3 complete - routing intelligence metrics operational

Note: nexus_pending_requests gauge already tracked via Registry atomic counters
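Extracting token usage from an optional `usage` field might look like the sketch below. The `Usage` struct and its field names are assumptions modeled on OpenAI-style responses, and `token_observations` is a hypothetical helper; the `prompt` and `completion` values correspond to the `type` label listed for `nexus_tokens_total`.

```rust
/// Assumed shape of the response's usage block; field names are hypothetical.
struct Usage {
    prompt_tokens: u32,
    completion_tokens: u32,
}

/// Turns optional usage data into (type label, value) pairs ready to be
/// recorded in a histogram. Yields nothing when the backend omits usage.
fn token_observations(usage: Option<&Usage>) -> Vec<(&'static str, f64)> {
    match usage {
        Some(u) => vec![
            ("prompt", f64::from(u.prompt_tokens)),
            ("completion", f64::from(u.completion_tokens)),
        ],
        None => Vec::new(),
    }
}
```

Returning an empty list for a missing `usage` avoids recording zero-valued samples that would skew the histogram.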
All US4 functionality was already implemented in US1:

- update_fleet_gauges() computes backends_total, backends_healthy, models_available
- Both handlers call update_fleet_gauges() before rendering
- StatsResponse includes per-backend and per-model breakdowns
- Fleet gauges track Registry state: total backends, healthy backends, unique models

Tasks completed: T055-T061 (all validation tasks)

User Story 4 complete - fleet state visibility operational
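The gauge computation described above can be sketched as a pure function over a snapshot of the registry. The `Backend` and `FleetGauges` types are hypothetical, and counting models only on healthy backends is an assumption; the PR says only that the gauge tracks unique models.

```rust
use std::collections::HashSet;

/// Hypothetical snapshot of one registered backend.
struct Backend {
    healthy: bool,
    models: Vec<String>,
}

/// Values for the three fleet gauges.
#[derive(Debug, PartialEq)]
struct FleetGauges {
    backends_total: usize,
    backends_healthy: usize,
    models_available: usize,
}

/// Computes fleet gauges from a registry snapshot.
/// Assumption: a model counts as available only if a healthy backend serves it.
fn compute_fleet_gauges(backends: &[Backend]) -> FleetGauges {
    let unique_models: HashSet<&str> = backends
        .iter()
        .filter(|b| b.healthy)
        .flat_map(|b| b.models.iter().map(String::as_str))
        .collect();

    FleetGauges {
        backends_total: backends.len(),
        backends_healthy: backends.iter().filter(|b| b.healthy).count(),
        models_available: unique_models.len(),
    }
}
```

Recomputing the gauges from current state on each scrape, rather than incrementing and decrementing them, means they cannot drift away from the registry.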
Complete implementation of Request Metrics feature (F09) with all 4 user stories:

**User Story 1 - Basic Request Tracking (P1 - MVP):**
- ✅ Track requests with nexus_requests_total counter (model, backend, status labels)
- ✅ Track errors with nexus_errors_total counter (error_type, model labels)
- ✅ Expose GET /metrics endpoint (Prometheus text format)
- ✅ Expose GET /v1/stats endpoint (JSON format with uptime and stats)

**User Story 2 - Performance Monitoring (P2):**
- ✅ Track request duration with nexus_request_duration_seconds histogram
- ✅ Track backend health check latency with nexus_backend_latency_seconds histogram
- ✅ Custom histogram buckets optimized for LLM inference (0.1-300 seconds)

**User Story 3 - Routing Intelligence (P3):**
- ✅ Track fallbacks with nexus_fallbacks_total counter (from_model → to_model)
- ✅ Track token usage with nexus_tokens_total histogram (prompt/completion)
- ✅ Track pending requests via Registry atomic counters

**User Story 4 - Fleet State Visibility (P3):**
- ✅ Track total backends with nexus_backends_total gauge
- ✅ Track healthy backends with nexus_backends_healthy gauge
- ✅ Track available models with nexus_models_available gauge
- ✅ Per-backend and per-model breakdowns in /v1/stats

**Technical Implementation:**
- Thread-safe atomic operations (no locks in hot path)
- Label sanitization for Prometheus compatibility
- Graceful handling of metrics initialization (test-compatible)
- All 282 unit tests + integration tests passing
- Zero clippy warnings

**Files Modified/Created:**
- src/metrics/mod.rs (MetricsCollector, setup_metrics, sanitize_label)
- src/metrics/types.rs (StatsResponse, RequestStats, BackendStats, ModelStats)
- src/metrics/handler.rs (metrics_handler, stats_handler)
- src/api/completions.rs (instrumented with metrics recording)
- src/health/mod.rs (instrumented with latency tracking)
- Cargo.toml (added metrics 0.24, metrics-exporter-prometheus 0.16)

Phase 7 Polish: Documentation complete, all tests passing, ready for production. Integration tests (T015-T018, T030-T033, etc.) deferred to separate PR.
…d checklists

- Fix cargo fmt issues across all modified files
- Fix token histogram buckets (was using duration buckets [0.1-300s], now uses proper token buckets [10-128000])
- Replace TODO comments with clear limitation documentation
- Complete tasks.md: 55 checked, 23 deferred, 0 unchecked
- Add verification.md: 98 verified, 115 N/A, 0 unchecked
- Add walkthrough.md: architecture, metrics reference, test coverage
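The bucket fix matters because a histogram with duration-scale bounds places almost every token count in the overflow (`+Inf`) bucket. The sketch below illustrates this; only the range endpoints (0.1 to 300 and 10 to 128000) come from the commit message, while the interior bounds and the `bucket_index` helper are hypothetical.

```rust
/// Bucket upper bounds. Endpoints follow the commit message; the interior
/// values are illustrative assumptions, not the PR's actual configuration.
const DURATION_BUCKETS: [f64; 6] = [0.1, 1.0, 5.0, 30.0, 120.0, 300.0];
const TOKEN_BUCKETS: [f64; 6] = [10.0, 100.0, 1_000.0, 8_000.0, 32_000.0, 128_000.0];

/// Returns the index of the first bucket whose upper bound is >= `value`,
/// or `None` when the value only fits the implicit +Inf bucket.
fn bucket_index(bounds: &[f64], value: f64) -> Option<usize> {
    bounds.iter().position(|&upper| value <= upper)
}
```

With these illustrative bounds, a 512-token response exceeds every duration bucket, so it would only be counted under `+Inf`, whereas the token buckets place it in the third bucket (upper bound 1000).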
leocamello added a commit that referenced this pull request on Feb 17, 2026
feat: Request Metrics (F09)

Adds observability infrastructure with Prometheus-compatible metrics and JSON stats.

## Changes

- New src/metrics/ module: MetricsCollector, setup_metrics(), label sanitization, fleet gauges
- Prometheus endpoint: GET /metrics with counters, histograms, and gauges
- JSON stats endpoint: GET /v1/stats with uptime, backend/model breakdowns
- Instrumented completions handler with request counting, duration tracking, fallback/token metrics
- Instrumented health checker with backend latency histogram
- Dependencies: metrics 0.24, metrics-exporter-prometheus 0.16

Closes #101
Closes #102
Closes #103
Closes #104
Closes #105
Closes #106
F09: Request Metrics
Adds Prometheus-compatible metrics endpoint and JSON stats API for observability.
New Endpoints
- GET /metrics: Prometheus text format
- GET /v1/stats: JSON statistics

Metrics Tracked

Counters:
- nexus_requests_total{model, backend, status}
- nexus_errors_total{error_type, model}
- nexus_fallbacks_total{from_model, to_model}

Histograms:
- nexus_request_duration_seconds{model, backend}
- nexus_backend_latency_seconds{backend}
- nexus_tokens_total{model, backend, type}

Gauges:
- nexus_backends_total, nexus_backends_healthy, nexus_models_available

Implementation
- src/metrics/ module (mod.rs, types.rs, handler.rs)
- metrics crate atomic operations

Tests
Closes #101
Closes #102
Closes #103
Closes #104
Closes #105
Closes #106