Chauncey
|
61fbfe5274
|
[Bugfix] fixed inconsistent finish_reason handling between V0 and V1 engines (#27555)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-28 02:18:08 +00:00 |
|
Kuntai Du
|
255e34ca50
|
[Stability fix] turn off HMA allocator when connector is set (#27592)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
|
2025-10-27 18:32:23 -07:00 |
|
Roger Wang
|
a8d2e326ec
|
[Bugfix][CI] Fix config resolving logic with remote models (#27610)
|
2025-10-28 00:48:32 +00:00 |
|
Andrew Xia
|
53a56e658b
|
[gpt-oss][2/N] Support input_messages in responsesRequest (#26962)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-10-27 23:15:49 +00:00 |
|
usberkeley
|
69f064062b
|
Code quality improvements: version update, type annotation enhancement, and enum usage simplification (#27581)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
|
2025-10-27 17:50:22 +00:00 |
|
Micah Williamson
|
921e78f4bb
|
[ROCm] Update AITER branch for ROCm base docker (#27586)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2025-10-27 17:22:33 +00:00 |
|
Cyrus Leung
|
6ebffafbb6
|
[Misc] Clean up more utils (#27567)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-27 15:30:38 +00:00 |
|
Ben Browning
|
3b96f85c36
|
[Chore]: Stream tokens vs characters in tool call parser tests (#26513)
Signed-off-by: Ben Browning <bbrownin@redhat.com>
|
2025-10-27 23:06:25 +08:00 |
|
tingtinggithub
|
23ad820553
|
fixing mm placeholder replacement issue with gemma3 (#27538)
Signed-off-by: tingtingtang1992 <streamttt@gmail.com>
|
2025-10-27 14:34:01 +00:00 |
|
Varun Sundar Rabindranath
|
5d3be3ba4c
|
[Bugfix][LoRA][FusedMoE] Select MxFP4 Backend based on LoRA Enablement (#27487)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
|
2025-10-27 07:32:50 -07:00 |
|
Yu Jiaqi
|
4f882be4a0
|
[Model] Siglip2 Model Support (#27566)
Signed-off-by: piood <2477084691@qq.com>
|
2025-10-27 06:57:37 -07:00 |
|
Asaf Joseph Gardin
|
9273754222
|
[Hybrid] Added supports_mamba_prefix_caching Protocol (#27339)
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
|
2025-10-27 13:05:20 +00:00 |
|
Jee Jee Li
|
f4e8154076
|
[Kernel] Enable moe LoRA kernel support FP16 (#27468)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-27 19:48:37 +08:00 |
|
Fadi Arafeh
|
a663f6ae64
|
[cpu][perf] Fix low CPU utilization with VLLM_CPU_OMP_THREADS_BIND on AArch64 (#27415)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
|
2025-10-27 11:14:55 +00:00 |
|
Chauncey
|
a4fc21895e
|
[Bugfix] Fixed when return_token_ids=False, the first event still contains prompt_token_ids. (#27561)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-10-27 11:06:43 +00:00 |
|
Shanshan Shen
|
a3e8611da5
|
[Bugfix] Limit the default value of max_model_len when it is not specified by users (#27556)
Signed-off-by: shen-shanshan <467638484@qq.com>
|
2025-10-27 10:16:20 +00:00 |
|
Cyrus Leung
|
7c2bdb83dc
|
[Misc] Clean up utils (#27552)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-27 09:05:40 +00:00 |
|
Danielle Robinson
|
9932ed6a83
|
[Kernel] Adding split_K implementation for fused_moe_lora (#27291)
Signed-off-by: Danielle Robinson <dmmaddix@amazon.com>
Signed-off-by: Danielle Robinson <dcmaddix@gmail.com>
Co-authored-by: Danielle Robinson <dmmaddix@amazon.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-10-27 02:05:24 -07:00 |
|
Jee Jee Li
|
2d631d28c6
|
[Doc] Slight improvement to M2 and beyond (#27554)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-27 09:02:10 +00:00 |
|
Cyrus Leung
|
b368382964
|
[Model] Deprecate merge_by_field_config=False (#27551)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-27 16:43:00 +08:00 |
|
gnovack
|
a806c14cc7
|
[Performance][LoRA] add context varying params to 'do_not_specialize' in fused moe lora (#27445)
Signed-off-by: gnovack <gnovack@amazon.com>
|
2025-10-27 06:31:55 +00:00 |
|
yyzxw
|
181bf5bbde
|
[Docs] remove the incorrect enable_reasoning parameter (#27550)
Signed-off-by: zxw <1020938856@qq.com>
|
2025-10-26 23:17:19 -07:00 |
|
Cyrus Leung
|
cbd5e07a51
|
[Model] Use merge_by_field_config for MM models (Qwen series) (#27546)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-27 05:38:05 +00:00 |
|
CSWYF3634076
|
63b22e0dbb
|
[Model][Bugfix] fix ernie45 moe 300B SharedFusedMoE output tuple (#27316)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2025-10-26 20:53:31 -07:00 |
|
Roger Young
|
5980604c44
|
Fix MiniMax-M2 copyright (#27537)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-10-27 03:29:51 +00:00 |
|
youkaichao
|
361a7463d3
|
fix m2 test (#27536)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-10-27 01:04:36 +08:00 |
|
Roger Young
|
720af6ab79
|
[Model][MiniMax-M2] Support MiniMax-M2 Model (#27535)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
|
2025-10-27 00:59:11 +08:00 |
|
Cyrus Leung
|
55cba4a05c
|
[CI/Build] Update causal-conv1d installation (#27529)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-26 22:14:22 +08:00 |
|
Cyrus Leung
|
c7abff2990
|
Revert "[CI/Build] Use CPU for mm processing test on CI (#27522)" (#27531)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-26 04:44:27 -07:00 |
|
Yeshwanth N
|
71b1c8b667
|
[Chore]:Extract math and argparse utilities to separate modules (#27188)
Signed-off-by: Yeshwanth Surya <yeshsurya@gmail.com>
Signed-off-by: Yeshwanth N <yeshsurya@gmail.com>
Signed-off-by: yeshsurya <yeshsurya@gmail.com>
|
2025-10-26 04:03:32 -07:00 |
|
Cyrus Leung
|
8fb7b2fab9
|
[Doc] Fix links to GH projects (#27530)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-26 17:55:51 +08:00 |
|
Cyrus Leung
|
be7b55a83d
|
[Doc] Remove Molmo warning (#27527)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-26 16:22:52 +08:00 |
|
Lucia Fang
|
315b860abe
|
[bugfix]fix empty prompts for async-engine mode in benchmark throughput (#27494)
Signed-off-by: Lucia Fang <fanglu@fb.com>
|
2025-10-26 08:16:35 +00:00 |
|
rongfu.leng
|
87c41c26ad
|
[Bugfix] Fix processor initialization for model from modelscope instead of HF (#27461)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-26 07:44:31 +00:00 |
|
JartX
|
65d2cf9511
|
[BUGFIX][ROCM] ViT FlashAttention on ROCm (no GFX9) and contiguous on qwen3vl ROCm TORCH_SDPA (#27190)
Signed-off-by: JartX <sagformas@epdcenter.es>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2025-10-26 15:08:52 +08:00 |
|
Isotr0py
|
d63cd9ff10
|
[CI/Build] Use CPU for mm processing test on CI (#27522)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-26 13:09:18 +08:00 |
|
Cyrus Leung
|
66a168a197
|
[CI/Build] Refactor processing tests (#27470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-25 16:14:30 +00:00 |
|
Matthew Bonanni
|
a99564ac5b
|
[Attention] Add missing kv cache scale setup (#27490)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-10-25 00:12:49 -07:00 |
|
Cyrus Leung
|
4c5f632165
|
[Misc] Simplify max tokens in multimodal registry (#27500)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-24 23:56:01 -07:00 |
|
Kuntai Du
|
b853540388
|
[Core][Hybrid allocator + kv connector 1/n] Enable hybrid allocator + KV cache connector (#25712)
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
|
2025-10-24 23:34:18 -07:00 |
|
Zhuohan Li
|
56ed7609a9
|
Revert "[Misc] Remove use of CUDA_VISIBLE_DEVICES for device selectio… (#27502)
|
2025-10-25 05:31:43 +00:00 |
|
Jiangyun Zhu
|
29c9cb8007
|
[CI] Add tests for cudagraph (#27391)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2025-10-25 02:37:33 +00:00 |
|
Yihua Cheng
|
83f478bb19
|
[KVConnector] Migrate the LMCache integration code to be vLLM native (#25542)
Signed-off-by: ApostaC <yihua98@uchicago.edu>
|
2025-10-25 00:23:53 +00:00 |
|
Varun Sundar Rabindranath
|
269c4db0a4
|
[Misc][DP] Guard mxfp4 implementation selection (#27484)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
|
2025-10-24 23:29:24 +00:00 |
|
Wentao Ye
|
52efc34ebf
|
[Log] Optimize Startup Log (#26740)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-24 19:27:04 -04:00 |
|
Pengchao Wang
|
d95d0f4b98
|
[Distributed] Basic set of configuration for large EP deployment on GB200 (#27328)
Signed-off-by: Pengchao Wang <wpc@fb.com>
Co-authored-by: Lu Fang <30275821+houseroad@users.noreply.github.com>
|
2025-10-24 14:16:44 -07:00 |
|
Lehua Ding
|
0402428200
|
[Perf][Async Scheduling] Remove CPU->GPU sync in dummy_run (#27455)
Signed-off-by: Lehua Ding <lehuading@tencent.com>
|
2025-10-24 20:45:36 +00:00 |
|
jinghanhu
|
17af6aa0da
|
[Document] Add ms-swift library to rlhf.md (#27469)
|
2025-10-24 20:31:50 +00:00 |
|
Zhewen Li
|
fc168c33f3
|
[CI/Build] Fix test_torch_utils in AMD CI (#27317)
Signed-off-by: zhewenli <zhewenli@meta.com>
|
2025-10-24 12:26:00 -07:00 |
|
Isotr0py
|
acc78aeb88
|
[Bugfix] Fix interns1-vit qk norm code path (#27480)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-24 17:43:45 +00:00 |
|