Cyrus Leung
139d155781
[Frontend] Use engine argument to control MM cache size (#22441)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-07 09:47:10 -07:00
Cyrus Leung
766bc8162c
[Core] Store only the keys for multi-modal data in P0 (#22198)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-08-07 01:45:04 -07:00
wang.yuqi
2a4c825523
[CI] Skip the pooling models that do not support transformers v4.55 (#22411)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-06 23:05:03 -07:00
tc-mb
41b67f4263
[model] Support MiniCPM-V 4.0 (#22166)
Co-authored-by: imning3 <hbning@pku.edu.cn>
2025-08-06 18:35:46 -07:00
Asaf Joseph Gardin
46a13949d5
[v1] - Mamba1 Attention Metadata (#21249)
Signed-off-by: asafg <asafg@ai21.com>
Co-authored-by: asafg <asafg@ai21.com>
2025-08-06 17:03:42 -07:00
Isotr0py
fa00c5d75b
[Misc] Clean up duplicated hf overrides (#22311)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-06 07:50:25 +00:00
Woosuk Kwon
de98252f49
Add GPT-OSS model code and config [1/N] (#22327)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-05 23:26:00 -07:00
Harry Mellor
796bae07c5
Update transformers to v4.55 (#21931)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-05 22:56:14 -07:00
wang.yuqi
586f286789
[Model] Pooling model activation supports per request control by PoolingParams (#20538)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-05 00:37:00 -07:00
Yuxuan Zhang
6fa41e0c32
self.gate dtype update for GLM-4.5 (#22203)
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
2025-08-04 19:12:38 -07:00
Raghav Ravishankar
a5fff3bd49
Fix Arcee model weight loading: Add custom load_weights (#21725)
Signed-off-by: alyosha-swamy <raghav@arcee.ai>
2025-08-04 04:09:56 -07:00
Jee Jee Li
a7b8788d2c
[Misc] Modify the organization of GLM series (#22171)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-03 23:51:20 -07:00
Isotr0py
6a39ba85fe
[Bugfix] Fix failing multimodal standard test (#22153)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-03 19:04:38 +00:00
Isotr0py
3dddbf1f25
[Misc] Add tensor schema test coverage for multimodal models (#21754)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-03 00:52:14 -07:00
Yuxuan Zhang
25373b6c6c
for glm-4.1V update (#22000)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-08-02 01:46:57 -07:00
Roger Wang
067c34a155
docs: remove deprecated disable-log-requests flag (#22113)
Signed-off-by: Roger Wang <hey@rogerw.me>
2025-08-02 00:19:48 -07:00
Harry Mellor
38c8bce8b6
Enable headless models for pooling in the Transformers backend (#21767)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-01 10:31:29 -07:00
Isotr0py
3f8e952179
[Bugfix] Fix glm4.1v video inference issue (#22067)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-08-01 09:33:30 -07:00
Cyrus Leung
82de9b9d46
[Misc] Automatically resolve HF processor init kwargs (#22005)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-31 22:44:10 -07:00
Song
9484641616
[Model] Add step3 vl (#21998)
Signed-off-by: oliveryuan <yuansong@step.ai>
Co-authored-by: oliveryuan <yuansong@step.ai>
2025-07-31 23:19:06 +08:00
wang.yuqi
2836dd73f1
[Model][CI] Let more pooling models support v1 (#21747)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-31 01:51:15 -07:00
Cyrus Leung
004203e953
[CI/Build] Fix registry tests (#21934)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-30 09:10:41 -07:00
Isotr0py
6e599eebe8
[Bugfix] Fix OOM tests in initialization test (#21921)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-30 07:35:47 -07:00
Varun Vinayak Shenoy
547795232d
[Tests] Fixing bug inside MultiModalProfiler. (#21842)
Signed-off-by: Varun Shenoy <varun.vinayak.shenoy@oracle.com>
2025-07-30 00:44:15 -07:00
wang.yuqi
65f311ce59
[Frontend] Add LLM.reward specific to reward models (#21720)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-07-29 20:56:03 -07:00
Reza Barazesh
37efc63b64
[V0 deprecation] Guided decoding (#21347)
Signed-off-by: Reza Barazesh <rezabarazesh@meta.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 03:15:30 -07:00
Cyrus Leung
04fe61aa3d
[CI/Build] Fix plugin tests (#21758)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-28 15:08:05 +00:00
Anton Vlasjuk
656c24f1b5
[Ernie 4.5] Name Change for Base 0.3B Model (#21735)
Signed-off-by: vasqu <antonprogamer@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-28 12:22:32 +00:00
Cyrus Leung
a4ed731546
[Model] Prioritize Transformers fallback over suffix matching (#21719)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-28 02:15:31 -07:00
Cyrus Leung
86ae693f20
[Deprecation][2/N] Replace --task with --runner and --convert (#21470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-27 19:42:40 -07:00
Isotr0py
3d847a3125
[VLM] Add video support for Intern-S1 (#21671)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-27 11:49:43 +00:00
Isotr0py
eed2f463b2
[VLM] Support HF format Phi-4-MM model (#17121)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-26 20:07:57 -07:00
Maximilien de Bayser
1cd6eaba54
Support encoder-only models without KV-Cache (#21270)
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-07-26 21:09:52 +08:00
Isotr0py
f27fdfc3ed
[Bugfix] Investigate Qwen2-VL failing test (#21527)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-26 06:09:29 -07:00
Lyu Han
875af38e01
Support Intern-S1 (#21628)
Signed-off-by: Roger Wang <hey@rogerw.me>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-07-26 19:14:04 +08:00
Farzad Abdolhosseini
62965de5fe
[Model] Ultravox: Support Llama 4 and Gemma 3 backends (#17818)
Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai>
Signed-off-by: Patrick Li <patrick8289@gmail.com>
Co-authored-by: Patrick Li <patrick8289@gmail.com>
2025-07-25 18:12:31 -07:00
Yong Hoon Shin
2eddd437ba
Add interleaved RoPE test for Llama4 (Maverick) (#21478)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-07-25 17:07:26 -07:00
bigshanedogg
29c6fbe58c
[MODEL] New model support for naver-hyperclovax/HyperCLOVAX-SEED-Vision-Instruct-3B (#20931)
Signed-off-by: bigshanedogg <bigshane319@gmail.com>
2025-07-25 06:05:42 -07:00
Harry Mellor
13abd0eaf9
[Model] Officially support Emu3 with Transformers backend (#21319)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 03:22:12 -07:00
Ming Yang
772ce5af97
[Misc] Add dummy maverick test to CI (#21324)
Signed-off-by: Ming Yang <minos.future@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-23 20:22:42 -07:00
Christian Pinto
8560a5b258
[Core][Model] PrithviMAE Enablement on vLLM v1 engine (#20577)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
2025-07-23 11:00:23 -07:00
Asher
2671334d45
[Model] add Hunyuan V1 Dense Model support. (#21368)
Signed-off-by: Asher Zhang <asherszhang@tencent.com>
2025-07-23 03:54:08 -07:00
Isotr0py
4ecedd1806
[Bugfix] Fix nightly transformers CI failure (#21427)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-07-23 00:01:01 -07:00
Harry Mellor
f154bb9ff0
Simplify weight loading in Transformers backend (#21382)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-22 20:29:43 -07:00
Cyrus Leung
35366ae57c
[CI/Build] Fix test failure due to updated model repo (#21375)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-22 08:39:35 -07:00
Aritra Roy Gosthipaty
2226d5bd85
[Bugfix] Decode Tokenized IDs to Strings for hf_processor in llm.chat() with model_impl=transformers (#21353)
Signed-off-by: ariG23498 <aritra.born2fly@gmail.com>
2025-07-22 08:27:28 -07:00
Raushan Turganbay
f38ee34a0a
[feat] Enable mm caching for transformers backend (#21358)
Signed-off-by: raushan <raushan@huggingface.co>
2025-07-22 08:18:46 -07:00
Ning Xie
0df4d9b06b
[Misc] unify variable for LLM instance v2 (#21356)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-07-22 06:32:36 -07:00
Raghav Ravishankar
82b8027be6
Add arcee model (#21296)
Signed-off-by: alyosha-swamy <raghav@arcee.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-22 00:57:43 -07:00
Ming Yang
6ece16c4fe
[Misc] Add dummy maverick test (#21199)
Signed-off-by: Ming Yang <minos.future@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-21 09:08:09 -07:00