Pull requests: vllm-project/vllm

Pull requests list

[Bug] correct out dtype of rms_norm_gated native path
Labels: bug
#35369 opened Feb 26, 2026 by zufangzhu
[Bugfix] Fix Qwen2.5-Omni and Qwen3-Omni mixed-modality embed regression
Labels: bug, multi-modality, qwen, ready
#35368 opened Feb 26, 2026 by linyueqian
[WIP] [Feature] Add Qwen3-ForcedAligner support via token classification pooling
Labels: documentation, new-model, qwen
#35367 opened Feb 26, 2026 by haosdent Draft
3 tasks
[WIP] [Bugfix] Fix MXFP4 weight_loader crash on per-expert 2-D weights
Labels: bug
#35366 opened Feb 26, 2026 by haosdent
[Bugfix] Fix SymmMemCommunicator disabled on SM 10.3/12.0 GPUs
Labels: bug
#35360 opened Feb 26, 2026 by haosdent
[Bugfix][Frontend] Fix reasoning-end detection to check prompt tail o…
Labels: bug, frontend
#35358 opened Feb 26, 2026 by Julien-ser
5 tasks done
[DO NOT MERGE][Bugfix] Fix CPU memory leak in multimodal IPC sends
Labels: bug, v1
#35357 opened Feb 26, 2026 by ywang96 Draft
5 tasks
[Bugfix] Use is_integrated to detect UMA GPUs for memory reporting
Labels: bug, nvidia
#35356 opened Feb 26, 2026 by haosdent
add acceptance_threshold argument
Labels: v1
#35355 opened Feb 26, 2026 by youkaichao Draft
5 tasks
[Bugfix] Remove erroneous lower bound on LoRA vocab size constraint
Labels: bug, ready
#35354 opened Feb 26, 2026 by LucasWilkinson
1 of 2 tasks
[Bug] Fix missing <think> tag after tool call in MiniMax 2.1
Labels: bug
#35352 opened Feb 26, 2026 by stingoChen
3 of 5 tasks
[XPU] special handle for pooler models w8a16 gemm
#35351 opened Feb 26, 2026 by yma11
5 tasks
Fix Qwen 3.5 tool calling problem
Labels: qwen
#35347 opened Feb 26, 2026 by sunqingn7
5 tasks
Cpu dispatcher
Labels: ci/build, cpu, documentation, nvidia, performance, rocm, v1
#35346 opened Feb 26, 2026 by majian4work
Fix MoE models in EP mode on Ascend
#35345 opened Feb 26, 2026 by ylyhyqsl
[Bug] FA2 is not supported for NVIDIA Blackwell architecture
Labels: bug, nvidia, v1
#35341 opened Feb 25, 2026 by olka
2 of 5 tasks
[ROCm] Enabling encoder and encoder-decoder on ROCm and AITER unified backends
Labels: documentation, rocm, v1
#35334 opened Feb 25, 2026 by gshtras
[Perf] Optimize model runner v2 prepare_inputs copy logic, 6.1% E2E throughput improvement
Labels: ready, v1, v2
#35333 opened Feb 25, 2026 by yewentao256
[Perf] Optimize maxsim scores computation for pooling models, 13.9% E2E throughput improvement
Labels: frontend, ready
#35330 opened Feb 25, 2026 by yewentao256