When running multiple rates (constant, poisson, concurrent profiles) or sweeping, stop escalating to higher rates if a failure constraint (over-saturation, max errors, error rate) triggers at a lower rate.
- Sort rates/streams ascending in AsyncProfile and ConcurrentProfile
- Add _should_stop_escalating() on base Profile class using stop_all as the failure signal (vs stop_local for normal completions)
- Skip failure check after throughput phase in SweepProfile since over-saturation is expected at maximum load
- Log warning when rate order is changed by sorting
- Update CLI help and README with multi-rate documentation
- Add comprehensive unit tests for all profile types
Signed-off-by: Uri Shaket <ushaket@redhat.com>
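The sorting and warning bullets above can be illustrated with a minimal sketch. It is not the code from this PR: the function name `sort_rates_ascending` and the logger name are hypothetical, and the sketch only assumes that rates are plain numbers and that a warning is logged whenever sorting changes the user-supplied order.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("multi_rate_sketch")


def sort_rates_ascending(rates):
    """Return the rates in ascending order, warning if the order changed.

    Assumption: `rates` is a sequence of numbers as given on the command line.
    """
    original = list(rates)
    ordered = sorted(original)
    if ordered != original:
        # Tell the user that the run order differs from what was typed.
        logger.warning(
            "Rates reordered from %s to %s so that lower rates run first.",
            original,
            ordered,
        )
    return ordered


ordered_rates = sort_rates_ascending([10, 1, 5])
print(ordered_rates)  # [1, 5, 10]
```

Sorting before scheduling is what makes "stop escalating" well defined: once a lower rate fails, every remaining rate is guaranteed to be higher.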
This reverts commit 2538f49.
Signed-off-by: Uri Shaket <ushaket@redhat.com>
Summary
This PR adds cross-rate early-exit behavior for multi-rate benchmark profiles so benchmarks stop escalating once a terminal failure condition is hit at a lower rate/stream. It also makes rate/stream ordering deterministic (ascending) and updates user-facing docs/help text to reflect multi-rate semantics and skip behavior.
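As a rough illustration of the early-exit idea, the sketch below runs rates in ascending order and stops scheduling higher rates once the previous run ends in a terminal failure. The `stop_all` and `stop_local` values come from this PR's description; everything else (`RunResult`, `should_stop_escalating`, `run_escalating`, `fake_run`) is a hypothetical stand-in and not the project's actual classes or method signatures.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# State names taken from the PR description; how the real code stores them
# is an assumption of this sketch.
STOP_LOCAL = "stop_local"  # normal completion of one rate
STOP_ALL = "stop_all"      # terminal failure, do not escalate further


@dataclass
class RunResult:
    """Hypothetical summary of one benchmark run at a given rate."""
    rate: float
    request_processing: str


def should_stop_escalating(previous: Optional[RunResult]) -> bool:
    """True when the previous run ended with the terminal failure signal."""
    return previous is not None and previous.request_processing == STOP_ALL


def run_escalating(
    rates: List[float], run_one: Callable[[float], RunResult]
) -> List[RunResult]:
    """Run rates in ascending order, skipping the rest after a failure."""
    results: List[RunResult] = []
    previous: Optional[RunResult] = None
    for rate in sorted(rates):
        if should_stop_escalating(previous):
            break  # a lower rate already failed, so higher rates are skipped
        previous = run_one(rate)
        results.append(previous)
    return results


def fake_run(rate: float) -> RunResult:
    """Stand-in benchmark: pretend the system saturates at rate 5 and above."""
    state = STOP_ALL if rate >= 5 else STOP_LOCAL
    return RunResult(rate, state)


completed = run_escalating([10, 1, 5], fake_run)
print([r.rate for r in completed])  # [1, 5]
```

In this toy run, rate 1 completes normally, rate 5 triggers the failure signal, and rate 10 is never scheduled.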
Details
- request_processing=stop_all) from the previous benchmark state.
- AsyncProfile.next_strategy() to stop scheduling higher rates after a terminal failure; continue normally on stop_local.
- ConcurrentProfile.next_strategy() with the same early-exit behavior for stream escalation.
- SweepProfile.next_strategy() so synchronous and throughput always run, with early-exit applied only during async-rate continuation.
- --rate semantics, multi-value behavior, and failure-triggered skipping.
- stop_local)
- stop_all)
Test Plan
- pytest tests/unit/benchmark/test_profiles.py
- pytest tests/unit/benchmark -k profile
- guidellm benchmark ... --profile constant --rate 1 --rate 5 --rate 10
Related Issues
Use of AI
## WRITTEN BY AI ##)