- #42770 · WoosukKwon opened on May 15, 2026 · 20
- #39749 · simon-mo opened on Apr 13, 2026 · 6
- #44280 · BugenZhao opened on Jun 2, 2026 · 11
Issues
is:issue state:open
Issue creation is restricted in this repository
Search results
- [Bug]: CPUOffloading + enable_cross_layers_blocks + gptoss-120b · label: bug · Status: Open · #47054 in vllm-project/vllm
- [Bug]: GPT-OSS models silently ignore unsupported content types (input_file, etc.) · label: bug · Status: Open · #47051 in vllm-project/vllm
- Status: Open · #47047 in vllm-project/vllm
- [Bug][ROCm] GLM-5.2-FP8 sparse MLA decode degenerates at long context on gfx942 (MI325X) · label: rocm · Status: Open · #47042 in vllm-project/vllm
- [Bug]: vLLM 0.23.0: FlashInfer / Triton attention + FP8 KV cache doesn't work on H200 (sm_90) · label: bug · Status: Open · #47037 in vllm-project/vllm
- [Bug]: Gemma4ForConditionalGeneration + runtime LoRA greedy decode diverges from HuggingFace transformers on text-only prompts · label: bug · Status: Open · #47026 in vllm-project/vllm
- [Bug]: Crash on using JSON structured output with speculative decoding on guidance backend · label: bug · Status: Open · #47025 in vllm-project/vllm
- Status: Open · #47022 in vllm-project/vllm
- Status: Open · #47020 in vllm-project/vllm
- [Performance]: [Fused MoE] BLOCK_SIZE_K=128 decode default causes large slowdown on Volta/V100 · label: performance · Status: Open · #47019 in vllm-project/vllm
- [CPU][Bug] Dynamic Speculative Decoding crashes on CPU backend · label: cpu · Status: Open · #47014 in vllm-project/vllm
- [Bug]: AssertionError: Overwriting existing tensor attribute: weight_loader when serving FP8 model on 2x RTX 5090 (Blackwell) · label: bug · Status: Open · #47005 in vllm-project/vllm