Fix worker status for unusable terminal backend responses #615
ushaket wants to merge 6 commits into vllm-project:main from
Signed-off-by: Uri Shaket <ushaket@redhat.com>
This introduces a worker dependency on GenerationResponse; not sure if that's the right way to go.
Hmm, yeah, I am not a fan of depending on
Signed-off-by: Uri Shaket <ushaket@redhat.com>
Signed-off-by: Uri Shaket <ushaket@redhat.com>
Done
sjmonson left a comment
Can you rebase and get rid of the merge commits?
Summary
This PR fixes a scheduler correctness bug where requests could be marked as `completed` even when the backend resolved without a usable terminal response. It adds explicit terminal-response validation in the worker so that malformed or empty terminal results are surfaced as `errored` with a clear diagnostic instead of being counted as successful requests.
Details
- Adds terminal-response validation in `WorkerProcess` to guard final status transitions.
- Unusable terminal responses are marked `errored` with the message `[UNUSABLE_BACKEND_RESPONSE] backend resolved without a usable terminal response payload`.
- Adds `GenerationResponse`-aware usability criteria:
  - a response is usable when it has `text` or `output_metrics.total_tokens > 0`
  - a `None` terminal response is always unusable
  - the non-`GenerationResponse` fallback remains `bool(response)` for generic/test compatibility
- Adds tests in `tests/unit/scheduler/test_worker.py` for:
  - missing terminal response -> `errored`
  - empty `GenerationResponse` -> `errored`
  - token-bearing `GenerationResponse` with empty text -> `completed`
Test Plan
- `uv run pytest -q tests/unit/scheduler/test_worker.py -k "terminal_response or empty_generation_response or generation_response_with_tokens or invalid_initialization"`
- A missing terminal response is not marked `completed`
- An empty `GenerationResponse` is not marked `completed`
- A token-bearing response (`output_tokens > 0`) is accepted as `completed`
Related Issues
Use of AI
## WRITTEN BY AI ##)
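For reviewers, the usability criteria described under Details can be sketched roughly as follows. This is a minimal illustration and not the code in this PR. The `GenerationResponse` and `OutputMetrics` classes are hypothetical stand-ins with only the fields named in the description, and the helper names `is_usable_terminal_response` and `final_status` are invented for this example. It also assumes that "has `text`" means non-empty text.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Tuple

# Diagnostic message quoted from the PR description.
UNUSABLE_MSG = (
    "[UNUSABLE_BACKEND_RESPONSE] backend resolved without a usable "
    "terminal response payload"
)


@dataclass
class OutputMetrics:
    """Hypothetical stand-in; only total_tokens is named in the PR text."""

    total_tokens: Optional[int] = None


@dataclass
class GenerationResponse:
    """Hypothetical stand-in for the real GenerationResponse class."""

    text: Optional[str] = None
    output_metrics: OutputMetrics = field(default_factory=OutputMetrics)


def is_usable_terminal_response(response: Any) -> bool:
    """Return True if the terminal response carries a usable payload."""
    # A None terminal response is always unusable.
    if response is None:
        return False
    # GenerationResponse: usable with non-empty text or a positive token count.
    if isinstance(response, GenerationResponse):
        if response.text:
            return True
        total = response.output_metrics.total_tokens
        return total is not None and total > 0
    # Generic fallback for other response types (for example, test doubles).
    return bool(response)


def final_status(response: Any) -> Tuple[str, Optional[str]]:
    """Map a terminal response to (status, error message or None)."""
    if is_usable_terminal_response(response):
        return "completed", None
    return "errored", UNUSABLE_MSG
```

Under these assumptions, `final_status(None)` and `final_status(GenerationResponse())` both yield `errored` with the diagnostic, while a response with empty text and a positive token count yields `completed`.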