杨奇(yann qi)
655a09f653
[Model][VLM] Support R-4B Model ( #23246 )
...
Signed-off-by: yannqi <yannqi@qq.com>
Signed-off-by: 杨奇(yann qi) <51905299+yannqi@users.noreply.github.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: yannqiyang <yannqiyang@tencent.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-08-21 04:08:52 +00:00
Asaf Joseph Gardin
3663870c72
[V1][Mamba1] - Full CUDA and Piecewise CUDA Graphs Support ( #23035 )
...
Signed-off-by: asafg <asafg@ai21.com>
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
Co-authored-by: asafg <asafg@ai21.com>
2025-08-20 20:08:51 -07:00
Cyrus Leung
2461d9e562
[CI/Build] Split out mm processor tests ( #23260 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 20:05:20 -07:00
Cyrus Leung
4449235843
[Bugfix] Ensure correctness of HCXVision processing ( #23254 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 14:19:30 +00:00
xyxinyang
7cd17e22d7
[Model][V1] Support Ernie MTP ( #22169 )
...
Signed-off-by: zhouchong <zhouchong03@baidu.com>
Co-authored-by: zhouchong <zhouchong03@baidu.com>
2025-08-20 20:41:55 +08:00
Cyrus Leung
68fcd3fa73
[Bugfix] Ensure correctness of Cohere2Vision processing ( #23245 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 11:09:18 +00:00
Xin Yang
83e69a09d6
[Model] Support deepseek with eagle ( #21086 )
...
Signed-off-by: Xin Yang <xyangx@amazon.com>
2025-08-20 19:01:31 +08:00
Cyrus Leung
de7b67a023
[CI/Build] Sync multimodal tests ( #23181 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 05:06:42 +00:00
Cyrus Leung
64ab3c7253
[Doc] Update V1 status of various pooling models ( #23189 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-20 10:33:41 +08:00
Isotr0py
d6a1a20973
[CI/Build] Update transformers to v4.55.2 ( #23093 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-19 10:06:17 -07:00
myselvess
b87cb97a53
[Model] support new model ovis2.5 ( #23084 )
...
Signed-off-by: myselvess <244285088@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-19 13:12:59 +00:00
wang.yuqi
f856c33ce9
[Model] Add multi_label_classification support ( #23173 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-19 12:54:30 +00:00
Isotr0py
31fd3265c8
[Bugfix] Fix broken Minimax-01-VL model ( #22116 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-19 08:49:29 +00:00
Woosuk Kwon
14006840ea
[V0 Deprecation] Remove V0 FlashInfer attention backend ( #22776 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-18 19:54:16 -07:00
杨朱 · Kiki
569aefd134
chore: remove unnecessary patch_padding_side for the chatglm model ( #23090 )
...
Signed-off-by: carlory <baofa.fan@daocloud.io>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-18 12:32:13 +00:00
Cyrus Leung
27e8d1ea3e
[Refactor] Define MultiModalKwargsItems separate from MultiModalKwargs ( #23053 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-18 09:52:00 +00:00
Michael Goin
4fc722eca4
[Kernel/Quant] Remove AQLM ( #22943 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-08-16 19:38:21 +00:00
汪志鹏
829bbd7882
[New Model]mBART model ( #22883 )
...
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
2025-08-16 12:16:58 +00:00
Isotr0py
2dbccce8a6
[CI][Bugfix] Skip Ovis2 generation test because of broken remote code ( #22954 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-16 09:44:19 +00:00
Isotr0py
cc826a202b
[Multimodal] Update Tensor schema test to cover arbitrary shape mm inputs ( #22867 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-16 00:44:50 -07:00
Thomas Parnell
75531a6c13
[V1] [Hybrid] Support using float32 for state in Hybrid Models (Mamba2, Mamba1, Minimax) ( #22928 )
...
Signed-off-by: Daniel Afrimi <danielafrimi8@gmail.com>
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Daniel Afrimi <danielafrimi8@gmail.com>
Co-authored-by: Burkhard Ringlein <ngl@zurich.ibm.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
2025-08-15 12:57:06 +00:00
amirai21
fe91ce9591
[V1] - Split Prefill and Decode for Mamba1 models ( #22653 )
...
Signed-off-by: amirk <amirk@ai21.com>
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
Co-authored-by: Asaf Joseph Gardin <39553475+Josephasafg@users.noreply.github.com>
2025-08-15 08:59:52 +00:00
wang.yuqi
5406ebf5c9
[CI] Pooling models mteb test uses enforce_eager ( #22878 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-15 01:16:15 -07:00
Thomas Parnell
ab9f2cfd19
[CI] [Hybrid] Bump min transformers version for Bamba and Jamba ( #22908 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-14 11:01:16 -07:00
Isotr0py
7c3a0741c6
[Bugfix] Fix PixtralHFImagePixelInputs dynamic shape check ( #22827 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-14 02:35:43 -07:00
Cyrus Leung
0ca2393b47
[CI/Build] Increase pooling tolerance to pass CI ( #22844 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-08-13 18:52:48 -04:00
Isotr0py
df0e0f023e
[CI/Build] Skip gpt_big model test because of broken HF model ( #22848 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-13 20:36:28 +00:00
Cyrus Leung
c9232d41f4
[CI/Build] Update VLM common tests ( #22841 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-13 10:03:05 -07:00
Duc-Viet Hoang
a01e0018b5
[Bugfix] Fix Nemotron VL image processing ( #22739 )
...
Co-authored-by: ducviet00-h2 <viet.d.hoang@h2corporation.jp>
2025-08-13 03:11:36 -07:00
Woosuk Kwon
71683ca6f6
[V0 Deprecation] Remove multi-step scheduling ( #22138 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-08-12 20:18:39 -07:00
Harry Mellor
80bb1e8afe
Officially support SmolLM3 using the Transformers backend ( #22665 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-12 05:38:48 -07:00
dongluw
9f909b8996
[New Model] Support Command-A-Vision ( #22660 )
...
Signed-off-by: donglu <donglu@cohere.com>
2025-08-12 01:39:54 -07:00
wang.yuqi
6d729c43fb
[Bugfix] Fix ModernBert load & Enable sliding window attention for bidirectional attention. ( #22637 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
2025-08-12 00:23:17 -07:00
22quinn
807d21b80d
[BugFix] [Spec Decode] Remove LlamaForCausalLMEagle3 to fix CI ( #22611 )
...
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-08-11 10:31:36 -07:00
Isotr0py
c90fb03df5
[CI/Build] Skip Mllama HF runner tests with Transformers v4.55.0 ( #22659 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-11 10:00:58 -07:00
wang.yuqi
84cf78acee
[Model] Pooling models default to using chunked prefill & prefix caching if supported. ( #20930 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-11 09:41:37 -07:00
Maximilien de Bayser
39052dbca8
Support token_type_ids in V1 with less code changes ( #21985 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2025-08-10 22:54:59 -07:00
Isotr0py
049c245143
[Misc] Replace flaky image urls in pixtral test ( #22574 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-10 06:18:21 -07:00
Le Chen
3d7363e61c
[Config] add "qwen" as a native eagle3 target supported model ( #22333 )
...
Signed-off-by: lechen <lecself@163.com>
Signed-off-by: LeChen <lecself@163.com>
2025-08-09 20:21:05 -07:00
Thomas Parnell
61f67d8acd
[V1] [Hybrid] Enable Full CUDA Graph (decode-only) for Mamba layers ( #21401 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-09 20:16:11 -07:00
Nicolò Lucchesi
5a16fa614c
[Model] Gemma3n MM ( #20495 )
...
Signed-off-by: ShriKode <shrikode@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Co-authored-by: ShriKode <shrikode@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
2025-08-09 09:56:25 -07:00
Thomas Parnell
1bf5e1f25b
[CI] [Hybrid] Speed up hybrid models test by removing large models ( #22563 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-09 02:04:42 -07:00
Yuxuan Zhang
a6022e6fbc
GLM-4.5V with new class name at transformers ( #22520 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-09 00:50:21 -07:00
Isotr0py
7920e9b1c5
[Bugfix] Fix failing GPT-OSS initialization test ( #22557 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-09 00:03:26 -07:00
Thomas Parnell
8a0ffd6285
Remove mamba_ssm from vLLM requirements; install inside test container using --no-build-isolation ( #22541 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-08 23:05:32 -07:00
Isotr0py
429e4e2d42
[Bugfix] Fix ModernBert cuda graph capturing in v1 ( #21901 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-08 22:17:22 -07:00
Harry Mellor
41b9655751
Skip Qwen 1 in CI because remote code is no longer compatible with Transformers ( #22536 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-08 16:20:58 -07:00
Cyrus Leung
139d155781
[Frontend] Use engine argument to control MM cache size ( #22441 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-07 09:47:10 -07:00
Cyrus Leung
766bc8162c
[Core] Store only the keys for multi-modal data in P0 ( #22198 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-07 01:45:04 -07:00
wang.yuqi
2a4c825523
[CI] Skip the pooling models that do not support transformers v4.55 ( #22411 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-06 23:05:03 -07:00