dengyunyang
8f8f469b1b
[BugFix] skip language model in Encoder ( #30242 )
...
Signed-off-by: dengyunyang <584797741@qq.com>
2025-12-22 05:25:59 -08:00
Li Wang
256a33ecb4
[Model] Fix bagel failed to run ( #31132 )
...
Signed-off-by: wangli <wangli858794774@gmail.com>
2025-12-22 02:15:54 -08:00
Kevin McKay
14c3e6ade3
[Misc] Fix spelling typos in model comments ( #31117 )
...
Signed-off-by: c0de128 <kevin.mckay@outlook.com>
2025-12-21 21:14:14 -08:00
baonudesifeizhai
54c8924384
[MoE Refactor][5/N] Isolate zero expert to LongCatFlash ( #28891 )
...
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Signed-off-by: Dongjie Zou <85092850+baonudesifeizhai@users.noreply.github.com>
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robertgshaw2@gmail.com>
2025-12-20 18:22:04 +00:00
Yuxuan Zhang
8a7a414374
GLM-4.7 Tool Parser and Doc Update ( #30876 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
2025-12-20 00:09:58 +00:00
Wentao Ye
4cf9429897
[Bug] Fix error 'Dynamo failed to run FX node with fake tensors for Deepseek V3.2 ( #31046 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-12-19 23:31:31 +00:00
Zhonghua Deng
969bbc7c61
[Model] Add MiMo-V2-Flash support ( #30836 )
...
Signed-off-by: Abatom <abzhonghua@gmail.com>
Signed-off-by: Jumiar <liuanqim10@126.com>
Signed-off-by: Zyann7 <zyann7@outlook.com>
Co-authored-by: Jumiar <liuanqim10@126.com>
Co-authored-by: Zyann7 <zyann7@outlook.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-12-19 17:17:03 +00:00
Andreas Karatzas
7b43db210c
[ROCm][CI][Bugfix] Multi-Modal Model Support Fixes and Attention Backend Improvements ( #30270 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-19 02:17:27 +00:00
Isotr0py
700a5ad6c6
[MM Encoder]: Migrate legacy ViT MultiHeadAttention to new MMEncoderAttention interface ( #30684 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-19 02:04:19 +08:00
sarathc-cerebras
28d15ab56b
adds jais 2 support ( #30188 )
...
Signed-off-by: sarathc-cerebras <sarath.chandran@cerebras.net>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-12-18 15:46:58 +00:00
zzhxxx
b166ef20e1
[refactor] Add prefix support to embed_tokens in DeepSeek MTP ( #30788 )
...
Signed-off-by: zzhx1 <zzh_201018@outlook.com>
2025-12-18 04:45:56 +00:00
Isotr0py
6fe5887652
[Chore] Remove v0 dead code for Qwen2.5-omni ( #30883 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-17 19:54:39 -08:00
Isotr0py
74a1ac38b0
[v1] Add PrefixLM support to TritonAttention backend ( #30386 )
2025-12-17 16:05:24 -08:00
KimHyemin
196cdc3224
[Model] Gemma3: Support untied word embeddings ( #30827 )
...
Signed-off-by: www-spam <panmahm@naver.com>
2025-12-17 07:11:18 -08:00
baoqian426
84896fda22
[Bugfix] deepseek-V3.2 self.weights_proj has no bias ( #30841 )
...
Signed-off-by: baoqian <1354987947@qq.com>
Signed-off-by: baoqian426 <1354987947@qq.com>
2025-12-17 03:32:34 -08:00
Asaf Joseph Gardin
a9e15c21ef
[Mamba] Removed disable cascade attn in MambaModelConfig ( #30712 )
...
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
2025-12-17 08:48:53 +00:00
Grzegorz K. Karch
f5db6385a1
Fix nemotron_nas intermediate_size computation ( #30795 )
...
Signed-off-by: Grzegorz Karch <gkarch@nvidia.com>
2025-12-17 01:06:28 +00:00
Roger Wang
f5f51e5931
[Core][MM] Optimize encoder cache manager by operating with embeddings only ( #30475 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Sun Kim <sunytokki@gmail.com>
2025-12-16 14:18:17 -08:00
Harry Mellor
0b0acc758e
Remove head_mask from Ultravox and Swin ( #30764 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-16 08:02:41 -08:00
Harry Mellor
6f15ac5de7
Don'e assume position_embedding_type will be present for BERT and RoBERTa models ( #30770 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-16 13:40:26 +00:00
Isotr0py
e94384bbad
[Bugfix] Fix broken ViT attention selection for Blackwell device ( #30731 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-16 05:24:32 +00:00
Shanshan Shen
3bd9c49158
[CustomOp] Extract ApplyRotaryEmb as CustomOp and unify the dispatch logic ( #29873 )
...
Signed-off-by: shen-shanshan <467638484@qq.com>
Co-authored-by: gcanlin <canlinguosdu@gmail.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
2025-12-15 19:08:16 -08:00
Max Hu
3f175f18a2
[Bugfix] Fix multimodal configuration for Qwen3VL MOE model ( #30670 )
...
Signed-off-by: Max Hu <hyoung2991@gmail.com>
2025-12-15 14:06:01 +00:00
duke
e4806d973a
[BugFix] Add embed_input_ids method to make QWenLMHeadModel a vllm model ( #30674 )
...
Signed-off-by: root <iwzbi@zju.edu.cn>
Co-authored-by: root <iwzbi@zju.edu.cn>
2025-12-15 10:38:29 +00:00
wang.yuqi
4429d934de
[Model] Automatic conversion of TokenClassification model ( #30666 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-12-15 08:13:00 +00:00
汪志鹏
1adeb3b84c
[New Model] BAGEL support (AR only) ( #28439 )
...
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-15 14:58:23 +08:00
Shanshan Shen
87b4d1557d
[CustomOp][MM] Extract MMEncoderAttention as CustomOp and replace the backend of QwenVisionAttention with it. ( #30125 )
...
Signed-off-by: shen-shanshan <467638484@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-12-15 11:13:32 +08:00
ZiTian Zhao
ae88aada38
[Feature]Add EVS (Efficient Video Sampling) Support for Qwen3-VL ( #29752 )
...
Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com>
Co-authored-by: deitxfge <huhaibo1990@126.com>
2025-12-14 05:24:56 -08:00
zifeitong
48b8456ff9
[Bugfix] Revert Qwen2-VL part of change in #28271 ( #30542 )
...
Signed-off-by: Zifei Tong <zifeitong@gmail.com>
2025-12-14 05:20:08 -08:00
Ilya Markov
3224ea9915
[torch.compile] Add encoder tag for compilation ( #30489 )
...
Signed-off-by: ilmarkov <markovilya197@gmail.com>
2025-12-14 18:15:11 +08:00
Lasha Koroshinadze
3a20450d31
Add AudioFlamingo3 model support ( #30539 )
...
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>
Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-14 02:14:55 -08:00
Chen Zhang
ace34e3783
[Bugfix] Qwen3-next with --hf-overrides \{\"num_hidden_layers\":8\} ( #30433 )
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-12-13 22:12:45 +08:00
Cyrus Leung
64251f48df
[Chore] Adjust tokenizer import to avoid circular imports ( #30601 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-13 04:42:39 -08:00
Roberto L. Castro
4fa7ce46f3
[Feature] Add SM103 (Blackwell Ultra) Support to vLLM ( #30484 )
...
Signed-off-by: LopezCastroRoberto <robertol.c510@gmail.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2025-12-12 19:34:23 -08:00
Lucas Wilkinson
3e41992fec
[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 ( #27532 )
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-12 05:57:47 -08:00
Jaehwang Jung
f90319d5d1
[Bugfix] Schedule failure due to wrong get_image_size_with_most_features ( #29692 )
2025-12-12 02:27:20 -08:00
Michael Goin
9f2fc16a69
[Bugfix][Model] Fix Afmoe rope_parameters issue ( #30505 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-12 02:53:57 +00:00
Nicolò Lucchesi
0efd9f867c
[Core] Whisper Enable Encoder Batching ( #29421 )
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-12-11 21:06:51 +00:00
Harry Mellor
cf3eacfe58
Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim ( #30389 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-11 20:45:23 +00:00
Harry Mellor
8781cd6b88
Add Eagle and Eagle3 support to Transformers modeling backend ( #30340 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-11 17:02:10 +00:00
Harry Mellor
93db3256a4
Give pooling examples better names ( #30488 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-11 16:22:58 +00:00
Cyrus Leung
3a3b06ee70
[Misc] Improve error message for is_multimodal ( #30483 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-11 06:39:51 -08:00
Cyrus Leung
13d63b65e0
[Deprecation] Remove missed fallback for embed_input_ids ( #30469 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-11 10:06:36 +00:00
Cyrus Leung
979f50efd0
[Deprecation] Remove fallbacks for embed_input_ids and embed_multimodal ( #30458 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-11 06:58:23 +00:00
gh-wf
36c9ce2554
Ensure minimum frames for GLM 4.6V compatibility ( #30285 )
...
Signed-off-by: Wayne Ferguson <wayneferguson@gmail.com>
2025-12-11 05:26:49 +00:00
Anker
e8e8cd73e5
[Bugfix] Fix HunyuanOCR cross-image contamination in batch processing ( #30344 )
...
Signed-off-by: Lennart Brog <lennart.borg@list-ag.de>
Signed-off-by: Anker <20343812+anker-c2@users.noreply.github.com>
2025-12-10 18:09:31 +00:00
Roger Young
d017bceb08
[BugFix] Fix minimax m2 model rotary_dim ( #30384 )
...
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
2025-12-10 04:58:50 -08:00
haoyangli-amd
06462392e4
[bugfix][quantization] fix quark qwen3 kv_cache quantization ( #30308 )
...
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
2025-12-10 03:24:12 +00:00
Tsukasa OI
73a484caa1
[Model][Quantization] Fix / Add GGUF support for Qwen2 MoE models ( #30307 )
...
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
2025-12-09 19:13:10 +00:00
wang.yuqi
9c32df6101
[Bugfix] Qwen 3 VL Embedding loading ( #30303 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-09 08:04:02 +00:00