Luciano Martins
|
c2612371ad
|
[Model] Add Gemma3 GGUF multimodal support (#27772)
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-18 08:56:29 -08:00 |
|
Pranav
|
f77bce001a
|
[Model] Add Afmoe architecture implementation (#28332)
Signed-off-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
Signed-off-by: Pranav <veldurthipranav@gmail.com>
Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
|
2025-11-17 15:11:20 -08:00 |
|
Yong Hoon Shin
|
11ac9ddd03
|
Support all interleaved layer types (#28485)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-11-13 08:57:20 +00:00 |
|
Harry Mellor
|
d9ab1ad9d1
|
reasoning_content -> reasoning (#27752)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-11-08 12:15:08 +00:00 |
|
Junhong Liu
|
59b453eaa2
|
Speed up mm processor kwargs per request by splitting dynamic and static kwargs (#26483)
Signed-off-by: Junhong <liujunhong11@huawei.com>
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
|
2025-11-07 07:51:28 +08:00 |
|
Julien Denize
|
7a8375f8a0
|
Add llama 4 scaling support (#28145)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-11-06 18:55:17 +00:00 |
|
Julien Denize
|
a404e2c0f1
|
Patch Mistral Tokenizer (#28146)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-11-06 06:43:16 +00:00 |
|
Isotr0py
|
43ecd0a900
|
[Chore] Clean up deepseek v2/v3 config copy (#28055)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-06 03:46:30 +00:00 |
|
Isotr0py
|
ffb08379d8
|
[Chore] Remove Nemotron-Nano-VL config copy (#28126)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-05 20:06:45 +00:00 |
|
Isotr0py
|
3f5a4b6473
|
[Bugfix] Validate custom logits processor xargs for online serving (#27560)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-11-05 16:53:33 +00:00 |
|
pwschuurman
|
f7d2946e99
|
[Bugfix] Skip gs:// model paths for speculator detection (#27846)
Signed-off-by: Peter Schuurman <psch@google.com>
|
2025-11-03 14:31:03 +00:00 |
|
Cyrus Leung
|
6c317a656e
|
[Misc] Provide Siglip2 chat template (#27939)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-11-02 13:42:38 +00:00 |
|
Julien Denize
|
73444b7b56
|
Performance fix MistralTokenizer: cache special ids and tokens (#27925)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
|
2025-11-02 08:48:33 +00:00 |
|
Zhiyuan Li
|
4e68cc9b6a
|
[Model] Introduce Kimi Linear to vLLM (#27809)
Signed-off-by: lizhiyuan <lizhiyuan@moonshot.cn>
Signed-off-by: Zhiyuan Li <uniartisan2017@gmail.com>
|
2025-10-30 21:02:27 +08:00 |
|
Roger Wang
|
a8d2e326ec
|
[Bugfix][CI] Fix config resolving logic with remote models (#27610)
|
2025-10-28 00:48:32 +00:00 |
|
Yu Jiaqi
|
4f882be4a0
|
[Model] Siglip2 Model Support (#27566)
Signed-off-by: piood <2477084691@qq.com>
|
2025-10-27 06:57:37 -07:00 |
|
rongfu.leng
|
87c41c26ad
|
[Bugfix] Fix processor initialization for model from modelscope instead of HF (#27461)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-26 07:44:31 +00:00 |
|
Yu Jiaqi
|
0552cfb195
|
[Model] Siglip Embedding Support (#27324)
Signed-off-by: piood <2477084691@qq.com>
|
2025-10-23 20:19:48 +00:00 |
|
tomeras91
|
61089465a6
|
[Model] Add MoE support for NemotronH (#25863)
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
|
2025-10-23 10:27:23 +00:00 |
|
Isotr0py
|
2566dca2a9
|
[Bugfix] Fix deepseek-ocr multi-image inference and add merge_by_field_config=True with tensor schema support (#27361)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-22 17:15:38 -07:00 |
|
Isotr0py
|
675aa2ec64
|
[Model] Upstream Deepseek-OCR model (#27247)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2025-10-22 07:59:15 -07:00 |
|
Nick Hill
|
647214f3d5
|
[V0 Deprecation] Remove V0 executors (#27142)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-10-21 11:09:37 -07:00 |
|
Cyrus Leung
|
d31f7844f8
|
[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-19 05:20:55 -07:00 |
|
Lukas Geiger
|
4d055ef465
|
Remove unused imports (#26972)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-10-16 19:51:17 -07:00 |
|
Cyrus Leung
|
4d4d6bad19
|
[Chore] Separate out vllm.utils.importlib (#27022)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-17 00:48:59 +00:00 |
|
Cyrus Leung
|
f6cdc9a02f
|
[Chore] Rename utils submodules (#26920)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-16 03:58:13 +00:00 |
|
Cyrus Leung
|
136a17fe6e
|
[Chore] Separate out vllm.utils.func (#26904)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-15 13:03:58 +00:00 |
|
Jialin Ouyang
|
07ca70af8d
|
[Core][Easy] Use envs.__getattr__ for all Unify to environment variable access (#26810)
Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>
|
2025-10-15 01:41:18 +00:00 |
|
wang.yuqi
|
767c3ab869
|
[Model][0/N] Improve all pooling task | clean up (#25817)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-13 16:44:50 +08:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Harry Mellor
|
7c12763b24
|
Fix some typing issues found by mypy==1.18.2 (#26596)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-10 18:21:25 +00:00 |
|
Shane A
|
8d2b8c0ff2
|
[Model] Add FlexOlmo model implementation (#24923)
Signed-off-by: Shane A <shanea@allenai.org>
|
2025-10-10 09:43:15 -07:00 |
|
Julien Denize
|
c6187f55f7
|
Refactor MistralTokenizer (#26358)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-10-09 22:48:58 +00:00 |
|
Wentao Ye
|
f8607863d8
|
[Feature] Enable E8M0 by Default on Hopper for DeepGEMM, 5% E2E throughput improvement (#26197)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-08 15:33:56 +08:00 |
|
Paul Pak
|
320feae6f5
|
[Model] Lfm2Moe (#26344)
Signed-off-by: Paul Pak <paulpak58@gmail.com>
|
2025-10-07 16:03:05 +00:00 |
|
Isotr0py
|
08d26a1b7e
|
[Model] Use merge_by_field_config for MM models (Ovis family) (#26308)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-07 12:54:22 +00:00 |
|
Rahul Tuli
|
05f6846ede
|
Support llama3 eagle3 head with llama4 verifier (#25961)
Signed-off-by: rahul-tuli <rtuli@redhat.com>
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
|
2025-10-06 13:56:08 -04:00 |
|
Harry Mellor
|
4e256cadc2
|
Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 09:18:11 -07:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Cyrus Leung
|
4570535ec4
|
[Model] CLIP Embedding Support (#26010)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-04 06:21:42 -07:00 |
|
Wentao Ye
|
767cbb011d
|
[CI] Fix Pre-commit Mypy Error (#26181)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-10-03 16:08:03 -07:00 |
|
Bowen Bao
|
78b8015a4d
|
[Bugfix] Relax tokenizer regex for mixtral to include 'tokenizer.model' (#25964)
Signed-off-by: Bowen Bao <bowenbao@amd.com>
|
2025-10-03 18:31:59 -04:00 |
|
Yongye Zhu
|
fa7e254a7f
|
[New Model] DeepSeek-V3.2 (Rebased to Main) (#25896)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Yongye Zhu <zyy1102000@gmail.com>
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Lucia Fang <fanglu@meta.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Lucia Fang <116399278+luccafong@users.noreply.github.com>
Co-authored-by: Lucia Fang <fanglu@meta.com>
Co-authored-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Siyuan Fu <siyuanf@nvidia.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Xiaozhu Meng <mxz297@gmail.com>
Co-authored-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
|
2025-09-30 17:14:41 +08:00 |
|
acisseJZhong
|
e47433b3c1
|
[BugFix] Pass config_format via try_get_generation_config (#25912)
|
2025-09-30 05:09:50 +00:00 |
|
Isotr0py
|
d4d9899860
|
[Quantization] Add field to skip unquantized modules for GPTQ config (#25455)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-26 15:47:41 +00:00 |
|
Isotr0py
|
03858e6d1c
|
[Bugfix] Fix InternS1 video processing after Transformers v4.56 (#25644)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-09-25 14:46:04 +00:00 |
|
Harry Mellor
|
8c853050e7
|
[Docs] Enable fail_on_warning for the docs build in CI (#25580)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-09-24 19:30:33 +00:00 |
|
rongfu.leng
|
2dda3e35d0
|
[Bugfix] add cache model when from object storage get model (#24764)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-09-24 18:11:16 +00:00 |
|
ahao-anyscale
|
c8bde93367
|
[BUG] Allows for RunAI Streamer and Torch.compile cache to be used together (#24922)
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
|
2025-09-23 18:13:32 -06:00 |
|
Roger Wang
|
7b57a433da
|
[Model] Support Dots OCR (#24645)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: yinz-aizip <yinz@aizip.ai>
|
2025-09-22 02:24:40 +00:00 |
|