Russell Bryant
4507a6dae4
CODEOWNERS: Add myself as reviewer on security docs (#28216)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-11-06 17:39:42 +00:00
Roy Wang
d1dd5f53e4
[Frontend] Fix logging format when enable response logging (#28049)
Signed-off-by: esmeetu <jasonailu87@gmail.com>
2025-11-06 16:25:39 +00:00
StanHatko
e52e4da971
[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores (#27953)
Signed-off-by: Stan Hatko <stan_hatko@live.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2025-11-06 23:47:11 +08:00
Milos Puzovic
2176778cd3
[Doc] Add Arm CPUs are on the list of supported targets in vLLM (#26018)
Signed-off-by: Milos Puzovic <milos.puzovic@arm.com>
2025-11-06 15:30:26 +00:00
Eric Yue
0370679ce9
[Kernel][Model] Tune fused_moe Triton configs for MiniMax-M2 on H100 (#28200)
Signed-off-by: minatoaquaMK2 <jiacheng.yue@foxmail.com>
2025-11-06 07:29:46 -08:00
Harry Mellor
8816e375d3
[Docs] Switch to directory style URLs (#28058)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-06 07:06:33 -08:00
Michael Goin
f32229293e
Disable nm-testing models with issues in CI (#28206)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-06 06:19:07 -08:00
xiangze-arm
c757a15f0f
[CPU]Improve cpu fused moe perf (#27244)
Signed-off-by: Zhang Xiangze <Xiangze.Zhang@arm.com>
2025-11-06 11:04:18 +00:00
Chauncey
59a50afa08
[Frontend] OpenAI Responses API supports Tool/Function calling - non-harmony (#26874)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-11-06 10:40:03 +00:00
courage17340
981cadb35c
[Bugfix][Kernel] fix merge attn states when both prefix and suffix are empty (#28181)
Signed-off-by: courage17340 <courage17340@163.com>
2025-11-06 17:52:13 +08:00
wangxiyuan
c3ee80a01a
[V0 deprecation]clean up is_v1_supported_oracle (#28116)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-06 16:05:32 +08:00
Aditya Tewari
3755c14532
[CPU] Enable torch profiling (#28130)
Signed-off-by: Aditya Tewari <aditya.tewari@arm.com>
2025-11-06 07:32:05 +00:00
Seungduk Kim
201dc98acc
Fix hard-coded parameter name in gemma3n.py (#27946)
Signed-off-by: Seungduk Kim <seungduk.kim@yanolja.com>
Signed-off-by: Biswa Panda <biswa.panda@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
2025-11-05 23:07:36 -08:00
Julien Denize
a404e2c0f1
Patch Mistral Tokenizer (#28146)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
2025-11-06 06:43:16 +00:00
Xiaozhu Meng
e31946f86e
[flashinfer] fix FI all2all with FI cutlass moe (#28166)
Signed-off-by: Xiaozhu <mxz297@gmail.com>
2025-11-06 05:52:16 +00:00
gmagogsfm
bde5039325
[CI] Add compile/test_multimodal_compile.py to CI (#28151)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-06 05:41:47 +00:00
Jacob Zhong
d72299d47b
Make the cv2 dependency optional (#27780)
Signed-off-by: Jacob <cmpute@qq.com>
2025-11-06 05:08:55 +00:00
Lukas Geiger
80679f108f
[Core][MM] Use non-blocking CPU-GPU copy of multimodal data (#28141)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-11-06 04:05:12 +00:00
Isotr0py
43ecd0a900
[Chore] Clean up deepseek v2/v3 config copy (#28055)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-06 03:46:30 +00:00
Chauncey
07d614511f
[Misc] Remove the duplicate code (#28111)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-11-05 21:07:47 -05:00
Vadim Gimpelson
f948ab6945
[CI Failure] nm-testing/Qwen2-0.5B-Instruct-FP8-SkipQKV was removed from HF. Skip it in tests (#28170)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
2025-11-06 01:22:13 +00:00
Wentao Ye
d71af5f502
[Feature] Enable TP + EP shared_experts overlap with router, 3.7% E2E performance improvement (#28164)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-11-05 17:21:08 -08:00
Wentao Ye
90189c71a9
[Bug] Fix env string "0" same to True (#28159)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-11-05 17:04:20 -08:00
Wentao Ye
d79d9f0780
[Bug] Fix cpu disable shared_experts VLLM_DISABLE_SHARED_EXPERTS_STREAM (#28157)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-11-05 17:03:09 -08:00
Vadim Gimpelson
b6a248bdd7
[PERF] Decouple projections from GDN custom op. Attempt 2 (#28083)
Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>
2025-11-05 17:01:12 -08:00
Dayeol Lee
1767658559
[Debugging] Add annotation for easier trace analysis (#22496)
2025-11-05 16:52:52 -08:00
Kuntai Du
efe73e9b57
[Core][Hybrid allocator + connector 2/n] Unify remove_skipped_blocks by get_last_useful_token (#25431)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
2025-11-06 00:12:00 +00:00
Zhewen Li
0b8e871e5e
[CI/Build] Fix test_defaults_with_usage_context in AMD CI (#27926)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-11-05 15:40:24 -08:00
Zhewen Li
5ee93a5956
[CI/Build] Update checking logic in cutlass_group_gemm_supported (#27948)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-11-05 15:40:10 -08:00
Snehlata
e15601789b
[Feature]: Add corrupted request metric to V1 metrics system. (#27306)
Signed-off-by: atalhens <sneh.lata@nutanix.com>
2025-11-05 13:45:29 -08:00
Richard Zou
65ac8d8dc4
[Docs] Add guide to debugging vLLM-torch.compile integration (#28094)
Signed-off-by: Richard Zou <zou3519@gmail.com>
2025-11-05 21:31:46 +00:00
Isotr0py
ffb08379d8
[Chore] Remove Nemotron-Nano-VL config copy (#28126)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-05 20:06:45 +00:00
R3hankhan
e04492449e
[Hardware][IBM Z] Optimize s390x Dockerfile (#28023)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2025-11-05 11:25:44 -08:00
Michael Yao
518ec6b722
[Docs] Clean up README_TUNING.md (#28088)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-11-05 19:01:34 +00:00
wang.yuqi
802748bddb
[Bugfix] Fix Qwen3-Reranker-8B load (#28117)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-11-05 18:33:50 +00:00
Paul Zhang
faedbb4d4f
[Feature] Extend batch invariant torch.compile to B200 (#27856)
Signed-off-by: PaulZhang12 <paulzhan@fb.com>
2025-11-05 10:04:49 -08:00
Samuel Shen
40db194446
[CI]: Add LMCacheConnector Unit Tests (#27852)
Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Co-authored-by: Samuel Shen <slshen@uchciago.edu>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>
2025-11-05 09:45:57 -08:00
Chen Zhang
c765f0b443
[FlashInfer] Avoid FlashInfer block_size 16 + head_size 256 on blackwell (#27994)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-11-05 09:25:32 -08:00
gmagogsfm
002b07c4b2
[Bugfix] vLLM should check Inductor config for compile cache enablement status (#27637)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
2025-11-05 12:22:44 -05:00
Walter Beller-Morales
752ddeacaa
[Core] add support for reasoning parser plugins (#28075)
Signed-off-by: walter beller-morales <walter.beller.morales@gmail.com>
2025-11-06 01:15:06 +08:00
Jiangyun Zhu
c18f88c6ca
[Kernel] Fuse computation of g and beta for Gated Delta Net (#28095)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-11-05 09:14:55 -08:00
Jiaju Zhang
6fd0df8132
[misc] add vLLM Beijing Meetup (#28127)
Signed-off-by: Jiaju Zhang <jjzhang@redhat.com>
2025-11-05 17:12:59 +00:00
Isotr0py
3f5a4b6473
[Bugfix] Validate custom logits processor xargs for online serving (#27560)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-05 16:53:33 +00:00
Pleaplusone
6cae1e5332
[ROCm][MLA] Support block-size > 1 for AITER MLA backend (#27224)
Signed-off-by: ganyi <ygan@amd.com>
Co-authored-by: wuhuikx <hattie.wu@amd.com>
2025-11-05 10:43:02 -05:00
Alexei-V-Ivanov-AMD
80c9275348
Enabling cooperative multi-gpu tests on multi-gpu nodes (#27986)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
2025-11-05 10:35:49 -05:00
Ilya Markov
e50c454672
[BugFix] Support EP/DP + EPLB with MTP (#25311)
Signed-off-by: ilmarkov <markovilya197@gmail.com>
Signed-off-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Sage Moore <sage@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
2025-11-05 15:22:17 +00:00
Chen Zhang
5d16d0fa62
[DCP] check return_lse for all layers in dcp (#27929)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-11-05 22:27:25 +08:00
bigmoyan
0606bea2b6
add kimi reasoning parser (#28128)
Signed-off-by: wangzhengtao <wangzhengtao@msh.team>
Co-authored-by: wangzhengtao <wangzhengtao@msh.team>
2025-11-05 21:48:33 +08:00
Frost Mitchell
6e97eccf5d
[XPU] Enable custom routing functions in IPEX for Llama4 (#28004)
Signed-off-by: frost-intel <frost.mitchell@intel.com>
2025-11-05 13:39:57 +00:00
Boyuan Feng
6ab183813c
[Graph Partition][Cache] Use inductor partition ops config (#27702)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
2025-11-05 13:04:48 +00:00