Pull requests: vllm-project/vllm
[Bugfix] Fix compressed-tensors models failing to load with transformers backend
bug
Something isn't working
quantization
ready
ONLY add when PR is ready to merge/full CI is needed
#30287
opened Dec 9, 2025 by
mgoin
5 tasks
Ensure minimum frames for GLM 4.6V compatibility
#30285
opened Dec 9, 2025 by
gh-wf
1 of 3 tasks
[BugFix] Lazy tokenizer init in StructuredOutputManager to prevent GGUF semaphore leak
structured-output
v1
#30284
opened Dec 9, 2025 by
kitaekatt
4 tasks
[Small] Add comment for parallel_config in FusedMoEModularKernel
#30282
opened Dec 8, 2025 by
yewentao256
•
Draft
[CI/Build] Ignore data_parallel_size_local
#30281
opened Dec 8, 2025 by
rjrock
3 of 5 tasks
[BugFix] Fix non detected failing tests
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#30277
opened Dec 8, 2025 by
ilmarkov
5 tasks
[ROCM][CI] Fix AMD Examples Test Group
ci/build
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
#30276
opened Dec 8, 2025 by
Concurrensee
[NIXL] refine decoder side post process for heterogeneous BlockSize and kv_layout
kv-connector
v1
#30275
opened Dec 8, 2025 by
xuechendi
5 tasks
[AMD] Amd/deepseek aiter fusions
deepseek
Related to DeepSeek models
needs-rebase
rocm
Related to AMD ROCm
v1
[Bugfix] Temporarily disable group quant rms norm fusion
#30273
opened Dec 8, 2025 by
ElizaWszola
[CI/Build] Use spawn subprocess for ROCm
documentation
Improvements or additions to documentation
rocm
Related to AMD ROCm
#30272
opened Dec 8, 2025 by
rjrock
3 of 5 tasks
[ROCm][CI][Bugfix] Multi-Modal Model Support Fixes and Attention Backend Improvements
ci/build
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
rocm
Related to AMD ROCm
#30270
opened Dec 8, 2025 by
AndreasKaratzas
[Frontend] Fixes anthropic streaming message_start usage nesting
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#30266
opened Dec 8, 2025 by
bbartels
5 tasks
Multiple Hybrid KV Cache Coordinator
v1
#30263
opened Dec 8, 2025 by
roikoren755
3 of 5 tasks
Support TP which is not divided for NVFP4 kernels (flashinfer-cutlass) by adding dynamic padding
nvidia
#30260
opened Dec 8, 2025 by
danielafrimi
[Feature]: OpenTelemetry Metrics Support
v1
#30258
opened Dec 8, 2025 by
mladjan-gadzic
•
Draft
3 of 5 tasks
[bugfix][quantization] Fix fp8 per_tensor scale shape
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
v1
#30257
opened Dec 8, 2025 by
haoyangli-amd
[ROCm] Use aiter.topk_sigmoid in llama4
llama
Related to Llama models
rocm
Related to AMD ROCm
#30255
opened Dec 8, 2025 by
tpopp
gptq marlin quantization support for fused moe with lora
#30254
opened Dec 8, 2025 by
Bhanu068
3 of 5 tasks
fix: DeepSeek-V3.2 DeepGEMM RuntimeError
deepseek
Related to DeepSeek models
#30251
opened Dec 8, 2025 by
KeeProMise
5 tasks
[gpt-oss] Add model_identity to system message retrieval for harmony chat template
frontend
gpt-oss
Related to GPT-OSS models
#30247
opened Dec 8, 2025 by
lyuwen
5 tasks