amitz-nv
6038b1b04b
[Frontend][Model] Add 'float16' to possible mamba cache dtype values, override mamba SSM cache dtype value for NemotronH ( #29978 )
...
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
2025-12-05 00:34:33 -08:00
Harry Mellor
e10c84e06a
Access partial_rotary_factor from rope_parameters ( #29966 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-04 18:42:49 +00:00
Tao Yun
6dcb07f676
support qwen3-vl handle requests with embeddings ( #30037 )
...
Signed-off-by: taoyun <1069423820@qq.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-12-04 17:34:06 +00:00
Cyrus Leung
b286a311c2
[Chore] Deprecate merge_by_field_config arg ( #30035 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-04 17:21:24 +00:00
Harry Mellor
9998ea5b57
Delete HF version of Phi 4 MM ( #30049 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-04 13:44:50 +00:00
wang.yuqi
74c4d80c6c
[Model][6/N] Improve all pooling task | Support chunked prefill with ALL pooling ( #27145 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-04 13:44:15 +00:00
Cyrus Leung
68eb5c8d97
[Misc] Move functions into PoolingMetadata ( #30027 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-04 08:21:19 +00:00
TJian
3f1b03739a
[ROCm] [Bugfix] compute_attn_mask_seqlen for qwen3 omni ( #29974 )
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-12-04 08:20:24 +00:00
Cyrus Leung
9ae2f60374
[Misc] Various cleanups for MM input processing ( #29970 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-04 06:22:20 +00:00
Isotr0py
a21cd9ed23
[Bugfix] Fix incorrect image_grid_thw rank for HunyuanOCR from missing merge_by_field_config=True ( #29950 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-03 10:05:10 +00:00
Julien Denize
5e5646e206
[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention ( #29908 )
...
Signed-off-by: juliendenize <julien.denize@mistral.ai>
2025-12-02 14:51:20 -08:00
Harry Mellor
6fc5841db1
Fix some more Transformers nightly tests ( #29872 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-02 21:49:44 +00:00
Navanit Dubey
a2b053dc85
feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE ( #29896 )
...
Signed-off-by: navanit-git <navanitdubey@gmail.com>
2025-12-02 19:28:35 +00:00
Isotr0py
0ec8422171
[Bugfix] Fix incorrect channel order for idefics3 in edge case ( #29881 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-02 16:03:52 +00:00
Matthew Bonanni
51c57b51dd
[Bugfix] Fix DeepSeek R1 MTP weight loading ( #29545 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
2025-12-02 15:52:18 +00:00
Cyrus Leung
68ffbca7e4
[Chore] Use tokenizer.encode and tokenizer.decode directly ( #29851 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-02 12:30:40 +00:00
Julien Denize
d8c6210eea
Add Mistral Large 3 and Ministral 3 ( #29757 )
...
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Mickael Seznec <mickael@mistral.ai>
2025-12-02 10:29:00 +00:00
Harry Mellor
f5b0846ba0
Fix some Transformers nightly tests ( #29802 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-02 07:05:27 +00:00
Cyrus Leung
653591d5e7
[Chore] Move tokenizer initialization methods ( #29793 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-02 13:33:37 +08:00
Johnny Yang
f441d36cee
Add missing return in _check_vllm_model_embed_input_ids ( #29834 )
...
Signed-off-by: Johnny Yang <johnnyyang@google.com>
2025-12-01 19:22:50 -08:00
sangbumlikeagod
092bb73b8a
[Frontend] add 'verbose_json' and 'timestamp' feature on Whisper Transcription/Translation ( #24209 )
...
Signed-off-by: sangbumlikeagod <oironese@naver.com>
Signed-off-by: sangbumlikeagod <98077576+sangbumlikeagod@users.noreply.github.com>
2025-12-01 18:19:17 +01:00
Xingyu Liu
21c2627934
[Misc]Remove redundant hidden_size property in ModelConfig ( #29749 )
...
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-30 17:14:23 +00:00
Cyrus Leung
64bc09ba27
[Core] Enable inputs_embeds_size separate from hidden_size ( #29741 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-30 17:31:12 +08:00
Cyrus Leung
fe3398fab2
[Chore] Enable passing tokenizer=None into MM processor ( #29724 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-29 06:25:10 -08:00
Cyrus Leung
34a984274e
[Misc] Refactor tokenizer interface ( #29693 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-29 04:02:21 -08:00
Jee Jee Li
39e63dec7c
[LoRA] Cleanup LoRA unused code ( #29611 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-28 22:52:58 -08:00
Jiangyun Zhu
a51f4186f2
[Bugfix] fix dots.llm1.inst ( #29687 )
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-28 15:25:26 -08:00
Cyrus Leung
7675ba30de
[Misc] Remove redundant ClassRegistry ( #29681 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-28 15:24:47 -08:00
Isotr0py
f946a8d743
[Chore]: Reorganize model repo operating functions in transformers_utils ( #29680 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-28 08:46:51 -08:00
Didier Durand
fae6943068
[Doc]: fixing typos in multiple files. ( #29685 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-28 08:41:41 -08:00
Mingyuan Ma
460d8bbf2d
Remove upstream fa checks ( #29471 )
...
Signed-off-by: mingyuanm <mingyuanm@nvidia.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-28 05:52:42 -08:00
Cyrus Leung
33b06a6f24
[Misc] Remove redundant attention var constants ( #29650 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-28 04:35:19 -08:00
Filipp Fisin
5f5521bd5d
Fix parameter order in GPT-OSS weight loading function for non-MXFP4 weights ( #29506 )
...
Signed-off-by: Filipp Fisin <48059208+qGentry@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-28 00:45:10 -08:00
Cyrus Leung
b34e8775a3
Revert "[CPU]Update CPU PyTorch to 2.9.0 ( #29589 )" ( #29647 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 22:43:18 -08:00
wang.yuqi
f4b76056ee
Improve enable chunked_prefill & prefix_caching logic. ( #26623 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-27 22:05:48 -08:00
EanWang211123
37b15e97e8
[Multimodal][Speculative Decoding]Eagle3 mm support, enablement on qwen3vl ( #29594 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
Signed-off-by: EanWang211123 <wangyiheng@sangfor.com.cn>
Co-authored-by: Louie Tsai <louie.tsai@intel.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-27 22:05:45 -08:00
Cyrus Leung
a24ea5414b
[Deprecation] Advance deprecation status ( #29617 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 19:04:58 +00:00
Cyrus Leung
ee9841daa9
[Bugfix] Fix doc build on main ( #29619 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 09:08:08 -08:00
Didier Durand
66d3d5422c
[Doc]: fixing typos in diverse files ( #29492 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-27 07:15:50 -08:00
Jee Jee Li
2f5f9acd55
[LoRA] Continue optimizing MoE LoRA weight loading ( #29322 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-27 05:56:28 -08:00
Roger Wang
cf348c8d27
[Bugfix] Fix HunyuanVL XD-RoPE ( #29593 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: grider-transwithai <grider@transwith.ai>
2025-11-27 12:36:24 +00:00
Cyrus Leung
00d3310d2d
[Bugfix] Update Ultravox compatibility ( #29588 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 01:36:18 -08:00
Matthew Bonanni
430dd4d9eb
[Attention] Remove imports from vllm/attention/__init__.py ( #29342 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-11-26 10:53:15 -07:00
Yejing Lai
bb706d6048
Fix TeleChatForCausalLM not register issue ( #29473 )
...
Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
2025-11-26 05:15:00 -08:00
Cyrus Leung
e30859dff3
[Bugfix] Fix handling of image embeds in models ( #29480 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-26 05:00:15 -08:00
Roger Wang
452a7c9f7c
[Misc] Allow LM only loading for Pixtral ( #29451 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-26 05:00:00 -08:00
George D. Torres
56531b79cc
[Misc] Add backup hash algorithm for FIPS constrained environments ( #28795 )
...
Signed-off-by: George D. Torres <gdavtor@gmail.com>
Signed-off-by: George D. Torres <41129492+geodavic@users.noreply.github.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-11-26 00:50:22 +00:00
Harry Mellor
0353d2e162
Fix RoPE related failures in Transformers nightly tests ( #29333 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-25 16:23:45 +00:00
Yifan Qiao
48ddb02b79
[Hybrid Allocator] Support KV cache groups with different block_size ( #29143 )
...
Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
2025-11-25 10:30:57 -05:00
Isotr0py
92effb07a4
[Model] Add HunyuanOCR support ( #29327 )
...
Signed-off-by: manayang <jackmanayang@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: sergeywang <sergeywang@tencent.com>
Co-authored-by: manayang <jackmanayang@gmail.com>
Co-authored-by: manayang <manayang@tencent.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-25 03:28:51 +00:00