[GuideLLM Refactor] mock server package creation#357
Merged
markurtz merged 2 commits into features/refactor/base (Sep 29, 2025)
Conversation
Contributor
Pull Request Overview
Introduces a comprehensive mock server implementation that simulates OpenAI and vLLM APIs with configurable timing characteristics and response patterns. This enables realistic performance testing and validation of GuideLLM benchmarking workflows without requiring actual model deployments.
- Modular architecture with configuration, handlers, models, server, and utilities components
- HTTP request handlers for OpenAI-compatible endpoints with streaming and non-streaming support
- High-performance Sanic-based server with CORS support and proper error handling
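The handler code itself isn't reproduced in this review summary, but a streaming chat-completions endpoint of this kind typically emits OpenAI-style `chat.completion.chunk` objects as server-sent events, terminated by a `[DONE]` sentinel. A minimal sketch (function names are illustrative, not taken from the PR):

```python
import json

def sse_chunk(delta_text, model="mock-model", finish_reason=None):
    """Format one OpenAI-style chat.completion.chunk as a server-sent event."""
    payload = {
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{
            "index": 0,
            "delta": {"content": delta_text} if delta_text else {},
            "finish_reason": finish_reason,
        }],
    }
    return f"data: {json.dumps(payload)}\n\n"

def stream_response(tokens, model="mock-model"):
    """Yield one SSE event per token, a final stop chunk, then [DONE]."""
    for tok in tokens:
        yield sse_chunk(tok, model)
    yield sse_chunk("", model, finish_reason="stop")
    yield "data: [DONE]\n\n"
```

In a Sanic handler the generator would be written to the response incrementally; the sketch above only shows the wire format.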
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/guidellm/mock_server/__init__.py | Package initialization exposing main MockServer and MockServerConfig classes |
| src/guidellm/mock_server/config.py | Pydantic-based configuration management with environment variable support |
| src/guidellm/mock_server/handlers/__init__.py | Handler module initialization exposing request handlers |
| src/guidellm/mock_server/handlers/chat_completions.py | OpenAI chat completions endpoint implementation with streaming support |
| src/guidellm/mock_server/handlers/completions.py | Legacy OpenAI completions endpoint with timing simulation |
| src/guidellm/mock_server/handlers/tokenizer.py | vLLM-compatible tokenization and detokenization endpoints |
| src/guidellm/mock_server/models.py | Pydantic models for request/response validation and API compatibility |
| src/guidellm/mock_server/server.py | Sanic-based HTTP server with middleware, routes, and error handling |
| src/guidellm/mock_server/utils.py | Mock tokenizer and text generation utilities for testing |
| tests/unit/mock_server/__init__.py | Test package initialization |
| tests/unit/mock_server/test_server.py | Comprehensive integration tests using real HTTP server instances |
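The table describes config.py as Pydantic-based configuration with environment-variable support. The stdlib-only sketch below mimics that idea without the pydantic dependency; the field names and the `MOCK_SERVER_` prefix are assumptions for illustration, not the PR's actual schema:

```python
import os
from dataclasses import dataclass, field

@dataclass
class MockServerConfig:
    """Illustrative config: each field falls back to an env var, then a default."""
    host: str = field(default_factory=lambda: os.getenv("MOCK_SERVER_HOST", "127.0.0.1"))
    port: int = field(default_factory=lambda: int(os.getenv("MOCK_SERVER_PORT", "8000")))
    model: str = field(default_factory=lambda: os.getenv("MOCK_SERVER_MODEL", "mock-model"))
    # Latency knobs: time-to-first-token and inter-token latency, in milliseconds
    ttft_ms: float = field(default_factory=lambda: float(os.getenv("MOCK_SERVER_TTFT_MS", "50")))
    itl_ms: float = field(default_factory=lambda: float(os.getenv("MOCK_SERVER_ITL_MS", "10")))
```

A pydantic-settings `BaseSettings` subclass would give the same behavior with validation and an `env_prefix` for free.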
Force-pushed: 2515465 to 4834767, 841e82c to b1cce19, ca2be85 to bb98193
Signed-off-by: Mark Kurtz <mark.j.kurtz@gmail.com>
sjmonson approved these changes on Sep 23, 2025
sjmonson added a commit that referenced this pull request on Sep 23, 2025
…nto features/refactor/base-draft [GuideLLM Refactor] mock server package creation #357
Base automatically changed from features/refactor/benchmarker to features/refactor/base on September 29, 2025 14:19
Summary
Introduces a comprehensive mock server implementation that simulates OpenAI and vLLM APIs with configurable timing characteristics and response patterns. The mock server enables realistic performance testing and validation of GuideLLM benchmarking workflows without requiring actual model deployments, supporting both streaming and non-streaming endpoints with proper token counting, latency simulation (TTFT/ITL), and error handling.
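To make the TTFT/ITL latency simulation concrete: the first generated token is delayed by the time-to-first-token, and every subsequent token by the inter-token latency. A small sketch of that schedule (function and parameter names are hypothetical, not from the PR):

```python
def token_delays(n_tokens, ttft_s=0.05, itl_s=0.01):
    """Per-token sleep schedule: TTFT before the first token, ITL before the rest."""
    if n_tokens <= 0:
        return []
    return [ttft_s] + [itl_s] * (n_tokens - 1)

def total_latency(n_tokens, ttft_s=0.05, itl_s=0.01):
    """End-to-end generation time implied by the schedule."""
    return sum(token_delays(n_tokens, ttft_s, itl_s))
```

In a streaming handler, the server would `await asyncio.sleep(d)` for each delay before emitting the corresponding chunk, so benchmark clients observe realistic TTFT and ITL distributions.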
Details
- `mock_server` package with modular architecture including configuration, handlers, models, server, and utilities
- `MockServerConfig` with Pydantic settings for centralized configuration management supporting environment variables
- `ChatCompletionsHandler` for `/v1/chat/completions` with streaming support
- `CompletionsHandler` for the `/v1/completions` legacy endpoint
- `TokenizerHandler` for vLLM-compatible `/tokenize` and `/detokenize` endpoints

Test Plan
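The utilities module is described as providing a mock tokenizer for testing. A deterministic stand-in like the following (purely illustrative; the PR's actual tokenizer may differ) is enough to serve `/tokenize` and `/detokenize` requests and produce stable token counts:

```python
import re

class MockTokenizer:
    """Deterministic word/punctuation tokenizer for testing only."""
    _pattern = re.compile(r"\w+|[^\w\s]")

    def tokenize(self, text):
        """Split text into word and punctuation tokens."""
        return self._pattern.findall(text)

    def detokenize(self, tokens):
        """Naive space-join; a real detokenizer would fix punctuation spacing."""
        return " ".join(tokens)

    def count(self, text):
        """Token count, e.g. for usage fields in mock responses."""
        return len(self.tokenize(text))
```

Determinism matters here: usage counts in mock responses stay reproducible across benchmark runs without loading a real tokenizer.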
Related Issues
Use of AI
## WRITTEN BY AI ##)