Woosuk Kwon
7fb2a5be28
[V0 Deprecation] Skip PP test (#25128)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 20:18:36 -07:00
Woosuk Kwon
6c036615dc
[V0 Deprecation] Remove misc V0 tests (#25118)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 19:41:55 -07:00
Woosuk Kwon
2fc24e94f9
[V0 Deprecation] Remove V0 Tracing & Metrics tests (#25115)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 19:40:44 -07:00
Woosuk Kwon
2c3c1bd07a
[V0 Deprecation] Remove V0 Engine tests (#25114)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 19:38:09 -07:00
bnellnm
5963b98b46
[Kernel] Delegate construction of FusedMoEQuantConfig to FusedMoEMethodBase subclasses (#22537)
...
Signed-off-by: Bill Nell <bnell@redhat.com>
2025-09-17 17:43:31 -06:00
elvischenv
e6585ddb45
[Bugfix] Fix accuracy issue for silu_mul + nvfp4 quant fusion kernel (#24833)
...
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-09-17 16:37:23 -07:00
afeldman-nm
7ae9887542
[V1] Logits processor docs (#22919)
...
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Signed-off-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: Joseph Marinier <Joseph.Marinier@gmail.com>
2025-09-17 11:53:12 -07:00
Michael Goin
e3db5ebb66
[CI Bugfix] Fix failing test_model_load_with_params tests due to tokenizer refactor (#25086)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-17 11:15:05 -07:00
Woosuk Kwon
9d442b7c48
[V0 Deprecation] Remove V0 tests in test_sequence.py (#25088)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 11:08:45 -07:00
Woosuk Kwon
eb68c2dcd9
[CI] Revert back prepare_prompts and check_answers (#25087)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 11:03:16 -07:00
Woosuk Kwon
4b946d693e
[V0 Deprecation] Remove V0 Core tests (#25082)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-17 09:32:42 -07:00
Michael Goin
087c6ffc92
[CI Bugfix] Fix failing test_invalid_env (#25078)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-17 08:28:58 -07:00
danielafrimi
252ada5559
Add RADIO Vision Encoder Support to vLLM (#24595)
...
Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
Co-authored-by: root <root@cw-dfw-h100-001-305-026.cm.cluster>
2025-09-17 05:53:30 -07:00
Cyrus Leung
e120533d7a
[Misc] Avoid use of deprecated AutoModelForVision2Seq (#25065)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-17 12:19:15 +00:00
Chauncey
544fe76b95
[Frontend] Support returning all prompt logprobs (#24956)
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-09-17 09:03:52 +00:00
Zhuohan Li
6c47f6bfa4
[Core] Remove tokenizer group in vLLM (#24078)
...
Signed-off-by: Zhuohan Li <zhuohan123@gmail.com>
2025-09-17 08:42:59 +00:00
haoyangli-amd
ca2d1925ef
[Rocm] [quantization] Fix quark ptpc moe and add test case (#24649)
...
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
Co-authored-by: Haoyang Li <haoyang.li@amd.com>
2025-09-16 22:15:13 -07:00
Roger Wang
0f7acdd73c
[Model] Support Qwen3-VL Model Series (#24727)
...
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Huang Jie <92386084+JJJYmmm@users.noreply.github.com>
Co-authored-by: 松灵 <26085463+wulipc@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-17 05:01:04 +00:00
Woosuk Kwon
5801e49776
[V0 Deprecation] Remove MQLLMEngine (#25019)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-16 21:29:27 -07:00
Nick Hill
eeb135eb87
[Core] Use CpuGpuBuffer for block table tensors (#24795)
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-09-16 19:18:06 -07:00
Tahsin Tunan
cef32104b4
[FP8] Extend per-token-group quantization support to QuantFP8 (#24342)
...
Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Luka Govedič <lgovedic@redhat.com>
2025-09-16 18:31:06 -07:00
Michael Goin
493b10f8bf
[CI] GPT-OSS GPQA eval test for Blackwell (#24920)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-16 18:13:21 -07:00
Andrew Sansom
02d4b85454
Use kwargs for long lists of EngineCoreRequest arguments in tests and fix extra kwargs (#24987)
...
Signed-off-by: Andrew Sansom <andrew@protopia.ai>
2025-09-16 14:06:56 -07:00
Andrew Xia
86daa875fe
[gpt-oss][1][bugfix] fix streaming final output (#24466)
...
Signed-off-by: Andrew Xia <axia@meta.com>
2025-09-16 13:56:16 -06:00
Andrew Xia
f4d6eb95cf
[gpt-oss][1b] streaming add item id, content id (#24788)
...
Signed-off-by: Andrew Xia <axia@meta.com>
2025-09-16 18:41:12 +00:00
Sage Moore
567939953b
[Core/DBO][1/N] Add Dual-Batch Overlap mechanism to VLLM (#23693)
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2025-09-16 12:21:48 -04:00
Ming Yang
4e5affeaa1
[CI] Add Decode Context Parallelism (DCP) test to CI (#24487)
...
Signed-off-by: Ming Yang <minos.future@gmail.com>
2025-09-16 21:21:28 +08:00
Chen Bruce
7ea5c73ad7
[Feat][EPLB] A novel static EPLB placement strategy for MoE models. (#23745)
...
Signed-off-by: bruceszchen <bruceszchen@tencent.com>
Signed-off-by: Chen Bruce <bruceszchen@tencent.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Chen Bruce <cszwwdz@vip.qq.com>
Co-authored-by: lemon412 <lemon412@foxmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-16 10:55:16 +00:00
cascade
17871983a2
[Bugfix] Fix sequence parallelism bug when enable pipeline parallelism (#24021)
...
Signed-off-by: cascade812 <cascade812@outlook.com>
2025-09-16 04:32:32 +00:00
Woosuk Kwon
759ef49b15
Remove V0 Encoder-Decoder Support (#24907)
...
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-15 21:17:14 -07:00
Mark McLoughlin
2942970d44
[Metrics] Hide deprecated metrics with gpu_ prefix (#24245)
...
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2025-09-15 20:15:57 -06:00
Wentao Ye
3c96e7b8a1
[CI] Small Accuracy Eval Test for Deepseek Model (#24259)
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-09-15 20:14:50 -06:00
Gregory Shtrasberg
2891603efd
[ROCm][Bugfix] Fix the case where there's bias (#24895)
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-09-15 20:05:12 -06:00
Jiangyun Zhu
5bcc153d7b
[Compile] Fix noop_elimination pass and add tests for noop_elimination (#24880)
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-09-15 23:33:18 +00:00
Mickaël Seznec
45bfa49cb8
[Tests] fix initialization of kv hash in tests (#24273)
...
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
2025-09-15 21:48:27 +00:00
Andrew Xia
25aba2b6a3
[gpt-oss] Add IncompleteDetails to ResponsesResponse (#24561)
...
Signed-off-by: Andrew Xia <axia@meta.com>
2025-09-15 13:07:55 -07:00
Kyle Sayers
a0b26701c9
[Transform] Deterministic Hadacore Transforms (#24106)
...
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-09-15 12:59:31 -06:00
Harry Mellor
c4afdb69cc
Move MultiModalConfig from config/__init__.py to config/multimodal.py (#24659)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-15 17:43:16 +00:00
Isotr0py
0e219cd50b
[Bugfix] Fix GLM4.1V multimodal processor with compatibility for Transformers v4.56 (#24822)
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-15 20:45:06 +08:00
ant-yy
72c99f2a75
[Model]: support Ling2.0 (#24627)
...
Signed-off-by: vito.yy <vito.yy@antgroup.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-15 05:09:30 -07:00
Ning Xie
bc0f6059a2
[UT] enhance free kv cache block queue popleft_n (#24220)
...
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-09-15 10:04:37 +00:00
Ce Gao
f4a948f33f
[Frontend] Skip stop in reasoning content (#14550)
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-09-15 06:04:55 +00:00
Ning Xie
3f3313981c
[kv cache] update num_free_blocks in the end (#24228)
...
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-09-15 05:15:12 +00:00
Chen Zhang
8e5cdcda4e
[Hybrid Allocator] Support Pipeline Parallel (#23974)
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-09-14 15:55:17 -07:00
Wenlong Wang
cfa3234a5b
[CI][Spec Decode] Adjust threshold for flaky ngram spec decoding test again (#24771)
...
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
2025-09-13 15:45:11 +08:00
Michael Goin
59d7ffc17f
[CI Failure] Fix test_flashinfer_cutlass_mxfp4_mxfp8_fused_moe (#24750)
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-09-13 07:29:19 +00:00
Elvir Crnčević
98229db244
[Kernels][DP/EP] Optimize Silu Kernel for R1 (#24054)
...
Signed-off-by: elvircrn <elvircrn@gmail.com>
2025-09-13 00:17:27 -07:00
Woosuk Kwon
5febdc8750
[Chore] Remove unused batched RoPE op & kernel (#24789)
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-13 00:08:20 -07:00
Shane A
89e08d6d18
[Model] Add Olmo3 model implementation (#24534)
...
Signed-off-by: Shane A <shanea@allenai.org>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-13 03:26:21 +00:00
Chenheli Hua
7f2ea7074e
[Frontend][Multimodal] Allow skipping media data when UUIDs are provided. (#23950)
...
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-09-13 02:16:06 +00:00