Isotr0py
f946a8d743
[Chore]: Reorganize model repo operating functions in transformers_utils ( #29680 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-28 08:46:51 -08:00
Didier Durand
fae6943068
[Doc]: fixing typos in multiple files. ( #29685 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-28 08:41:41 -08:00
Mingyuan Ma
460d8bbf2d
Remove upstream fa checks ( #29471 )
...
Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-28 05:52:42 -08:00
Cyrus Leung
33b06a6f24
[Misc] Remove redundant attention var constants ( #29650 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-28 04:35:19 -08:00
Filipp Fisin
5f5521bd5d
Fix parameter order in GPT-OSS weight loading function for non-MXFP4 weights ( #29506 )
...
Signed-off-by: Filipp Fisin <48059208+qGentry@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-28 00:45:10 -08:00
Cyrus Leung
b34e8775a3
Revert "[CPU]Update CPU PyTorch to 2.9.0 ( #29589 )" ( #29647 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 22:43:18 -08:00
wang.yuqi
f4b76056ee
Improve enable chunked_prefill & prefix_caching logic. ( #26623 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-27 22:05:48 -08:00
EanWang211123
37b15e97e8
[Multimodal][Speculative Decoding]Eagle3 mm support, enablement on qwen3vl ( #29594 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Signed-off-by: EanWang211123 <wangyiheng@sangfor.com.cn>
Co-authored-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-27 22:05:45 -08:00
Cyrus Leung
a24ea5414b
[Deprecation] Advance deprecation status ( #29617 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 19:04:58 +00:00
Cyrus Leung
ee9841daa9
[Bugfix] Fix doc build on main ( #29619 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 09:08:08 -08:00
Didier Durand
66d3d5422c
[Doc]: fixing typos in diverse files ( #29492 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-27 07:15:50 -08:00
Jee Jee Li
2f5f9acd55
[LoRA] Continue optimizing MoE LoRA weight loading ( #29322 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-27 05:56:28 -08:00
Roger Wang
cf348c8d27
[Bugfix] Fix HunyuanVL XD-RoPE ( #29593 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: grider-transwithai <grider@transwith.ai>
2025-11-27 12:36:24 +00:00
Cyrus Leung
00d3310d2d
[Bugfix] Update Ultravox compatibility ( #29588 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 01:36:18 -08:00
Matthew Bonanni
430dd4d9eb
[Attention] Remove imports from vllm/attention/__init__.py ( #29342 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-11-26 10:53:15 -07:00
Yejing Lai
bb706d6048
Fix TeleChatForCausalLM not registered issue ( #29473 )
...
Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
2025-11-26 05:15:00 -08:00
Cyrus Leung
e30859dff3
[Bugfix] Fix handling of image embeds in models ( #29480 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-26 05:00:15 -08:00
Roger Wang
452a7c9f7c
[Misc] Allow LM only loading for Pixtral ( #29451 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-26 05:00:00 -08:00
George D. Torres
56531b79cc
[Misc] Add backup hash algorithm for FIPS constrained environments ( #28795 )
...
Signed-off-by: George D. Torres <gdavtor@gmail.com>
Signed-off-by: George D. Torres <41129492+geodavic@users.noreply.github.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-11-26 00:50:22 +00:00
Harry Mellor
0353d2e162
Fix RoPE related failures in Transformers nightly tests ( #29333 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-25 16:23:45 +00:00
Yifan Qiao
48ddb02b79
[Hybrid Allocator] Support KV cache groups with different block_size ( #29143 )
...
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
2025-11-25 10:30:57 -05:00
Isotr0py
92effb07a4
[Model] Add HunyuanOCR support ( #29327 )
...
Signed-off-by: manayang <jackmanayang@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: sergeywang <sergeywang@tencent.com>
Co-authored-by: manayang <jackmanayang@gmail.com>
Co-authored-by: manayang <manayang@tencent.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-25 03:28:51 +00:00
Hanjie Qiu
5f9679a43b
[Spec Decode] Add support for EAGLE3 heads that do not use_aux_hidden_states ( #27688 )
...
Signed-off-by: hjjq <hanjieq@nvidia.com>
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
2025-11-24 20:13:12 -05:00
Yan Ma
3cfa63ad99
[XPU]fix Kimi-VL-A3B-thinking on xpu ( #29309 )
...
Signed-off-by: Yan Ma <yan.ma@intel.com>
2025-11-24 21:02:21 +00:00
Chenheli Hua
839c6b7b72
[Multimodal][Qwen3 Omni] Make Qwen3 Omni work with audio-in-video inputs in V1 engine. ( #27721 )
...
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-24 19:24:37 +00:00
Laith Sakka
7a228b5305
Add option to use unbacked and backed size obl dynamic shapes for more sound compilation. ( #26199 )
...
Signed-off-by: Laith Sakka <lsakka@meta.com>
2025-11-24 10:12:41 -05:00
杰兮
8005e606bf
[Bugfix][Rocm] Fix shared expert weight loading failure in DeepSeek-MTP ( #27563 )
...
Signed-off-by: zhyajie <yajizhan@amd.com>
Co-authored-by: zhyajie <yajizhan@amd.com>
2025-11-24 10:16:52 +00:00
Roger Wang
0ff70821c9
[Core] Deprecate xformers ( #29262 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Zero
30854783ad
[Model] Add OpenCUA-7B support ( #29068 )
...
Signed-off-by: lim4349 <rockmanzero@naver.com>
Signed-off-by: Zero <rockmanzero@naver.com>
Co-authored-by: Cloud User <ubuntu@a100-80g-4.novalocal>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-24 10:27:55 +08:00
Jee Jee Li
1073ba68b0
[LoRA] Optimize 3D MoE logic ( #29222 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-24 10:27:23 +08:00
ZiTian Zhao
d84d8f4429
Fix EVS crash when using video_embeds inputs in Qwen2.5-VL ( #29232 )
...
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-22 06:48:59 -08:00
Russell Bryant
cca2d2cdbe
[Core] Align whisper closer to other multimodal models ( #27292 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-11-21 12:01:54 +00:00
Cyrus Leung
aab0102a26
[V0 deprecation] Remove more V0 references ( #29088 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 11:56:59 +00:00
Huamin Li
8ac3a41487
[CI Failure] Fix Gemma3 RoPE configuration for sliding attention layers ( #29111 )
...
Signed-off-by: Huamin Li <3ericli@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-20 23:53:30 -08:00
Cyrus Leung
0e741c12e3
[Bugfix] Fix Plamo3 rope handling ( #29092 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-21 11:38:35 +08:00
Jee Jee Li
9875be6431
[LoRA][2/2]Remove LoRA extra vocab ( #28545 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-21 09:46:43 +08:00
Fanli Lin
a2e9ebe9e2
[BugFix] Fix flash_attn import in siglip2navit.py ( #29082 )
...
Signed-off-by: Fanli Lin <fanli.lin@intel.com>
2025-11-20 12:14:29 +00:00
Zhewen Li
93c8672ceb
[Bugfix] Fix spec decode memory regression after #28549 ( #28819 )
...
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-11-20 19:05:50 +08:00
Shinichi Hemmi
c9e093116c
[MODEL] Implement plamo3 ( #28834 )
...
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
2025-11-20 03:00:19 -08:00
Pleaplusone
06c20c9904
[ROCm] Add AMD GPU support on Deepseek v3.2 and SparseMLA ( #26670 )
...
Signed-off-by: ganyi <ygan@amd.com>
2025-11-20 02:54:01 -08:00
Anna Shors
6eb745d9bd
Add truncate arg to yarn to match openai implementation of gpt-oss ( #28244 )
...
Signed-off-by: ashors1 <ashors@nvidia.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
2025-11-20 18:53:50 +08:00
Dezhan
dc45efc8ef
[BugFix] Fix Llama4 Pipeline Parallelism Assert Error ( #28577 )
...
Co-authored-by: Dezhan Tu <dztu@meta.com>
2025-11-20 02:52:36 -08:00
Pleaplusone
7218f83992
[ROCm][BugFix] Fix shared expert loading error when disabling VLLM_ROCM_USE_AITER_FUSION_SHARED_EXPERTS ( #28633 )
...
Signed-off-by: ganyi <ygan@amd.com>
2025-11-20 14:50:23 +07:00
Lukas Geiger
a9705a290a
[Model][QwenVL] Replace torch.repeat_interleave with faster np.repeat ( #28964 )
...
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-11-19 22:04:23 -08:00
Isotr0py
64192d5624
[Bugfix] Revert custom attention mask for gemma3-mm ( #28995 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-20 13:23:22 +08:00
Wentao Ye
5031cd5d55
[Refactor] Optimize select_experts ( #28069 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-11-19 18:53:15 -05:00
JartX
8e38e99829
[Feature] EPLB on Qwen3VLMoe and CompressedTensorsWNA16MoEMethod ( #28849 )
2025-11-19 18:30:08 -05:00
Wentao Ye
0075bfffd4
[CI] Fix precommit rope_theta issue ( #29040 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-11-19 14:22:43 -08:00
Yongye Zhu
88f5b19f0b
[DeepSeek] Fix DeepSeek V3.2 Rope Embedding ( #28968 )
...
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
2025-11-19 16:30:04 -05:00
Qiu
2fd893b4ce
[Feature] Prefill Context Parallel (PCP) basic support ( #28718 )
...
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Signed-off-by: FENP <yuanyongjie.yyj@antgroup.com>
Signed-off-by: LookAround <lixushi@huawei.com>
Signed-off-by: Jingchun Gao <gaojingchun1@huawei.com>
Signed-off-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
Co-authored-by: FENP <yuanyongjie.yyj@antgroup.com>
Co-authored-by: LookAround <lixushi@huawei.com>
Co-authored-by: Jingchun Gao <gaojingchun1@huawei.com>
Co-authored-by: zhenwenqi2024 <zhenwenqi_2022@qq.com>
Co-authored-by: Jingchun Gao <63247409+gjc0824@users.noreply.github.com>
2025-11-19 15:52:44 -05:00