804 Commits

Author SHA1 Message Date
Isotr0py
0ff05e3770
[Bugfix] Fix encoder-only model support for transformers backend (#28021)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-04 22:24:41 -08:00
yt0428
05cae69f0f
[model] Add support for openPangu_Ultra_MoE (#27521)
Signed-off-by: yuantao <2422264527@qq.com>
Signed-off-by: yt0428 <51468697+yt0428@users.noreply.github.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-04 08:17:20 -08:00
zhang-prog
40b69e33e7
[Model] Add PaddleOCR-VL Model Support (#27758)
Signed-off-by: zhangyue <zhangyue66@baidu.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: zhangyue66 <zhangyue66@baidu.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-03 19:04:22 +08:00
Asaf Joseph Gardin
00b31a36a2
[V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377)
Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
2025-11-02 04:16:23 -08:00
TJian
e2347dbf58
[Bugfix] [Model] Missing MRoPE function definition from KeyeForConditionalGeneration (#27895)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-11-01 13:45:23 +08:00
Cyrus Leung
879a06579e
[CI/Build] Bump transformers version (#27528)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-31 22:11:07 -07:00
Fan Yin
9956aae4ea
[Model][Ouro] Support Ouro Model (#27794)
Signed-off-by: yinfan.1024 <yinfan.1024@bytedance.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: yinfan.1024 <yinfan.1024@bytedance.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-10-30 22:34:41 +08:00
Zhiyuan Li
4e68cc9b6a
[Model] Introduce Kimi Linear to vLLM (#27809)
Signed-off-by: lizhiyuan <lizhiyuan@moonshot.cn>
Signed-off-by: Zhiyuan Li <uniartisan2017@gmail.com>
2025-10-30 21:02:27 +08:00
wang.yuqi
4464723f22
[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-30 12:13:05 +00:00
Isotr0py
ad3ec89532
[VLM] Add Qwen3-VL generation test (#25185)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-29 12:19:37 +00:00
Zhewen Li
83fd49b1fc
[CI/Build][Bugfix]Fix Quantized Models Test on AMD (#27712)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-10-29 06:27:30 +00:00
Yu Jiaqi
4f882be4a0
[Model] Siglip2 Model Support (#27566)
Signed-off-by: piood <2477084691@qq.com>
2025-10-27 06:57:37 -07:00
Jee Jee Li
2d631d28c6
[Doc] Slight improvement to M2 and beyond (#27554)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-27 09:02:10 +00:00
youkaichao
361a7463d3
fix m2 test (#27536)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-10-27 01:04:36 +08:00
Roger Young
720af6ab79
[Model][MiniMax-M2] Support MiniMax-M2 Model (#27535)
Signed-off-by: xuebi <xuebi@minimaxi.com>
Co-authored-by: xuebi <xuebi@minimaxi.com>
2025-10-27 00:59:11 +08:00
Cyrus Leung
66a168a197
[CI/Build] Refactor processing tests (#27470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-25 16:14:30 +00:00
Yu Jiaqi
0552cfb195
[Model] Siglip Embedding Support (#27324)
Signed-off-by: piood <2477084691@qq.com>
2025-10-23 20:19:48 +00:00
wang.yuqi
3fa2c12185
[Frontend][4/N] Improve all pooling task | Add plugin pooling task (#26973)
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Christian Pinto <christian.pinto@ibm.com>
2025-10-23 14:46:18 +00:00
Cyrus Leung
fe2016de2d
[CI/Build] Remove unnecessary flags from test registry (#27353)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-23 14:42:40 +00:00
wang.yuqi
3729ed00ba
[Model] Add num_cached_tokens for PoolingRequestOutput (#27378)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-10-23 14:03:42 +08:00
Isotr0py
2566dca2a9
[Bugfix] Fix deepseek-ocr multi-image inference and add merge_by_field_config=True with tensor schema support (#27361)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-22 17:15:38 -07:00
dongbo910220
a0003b56b0
[Chore] Separate out system utilities from vllm.utils (#27201)
Signed-off-by: dongbo910220 <1275604947@qq.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-22 20:25:25 +00:00
Luciano Martins
e05a6754a8
[Model] Revert PR #26715: Restore custom PaliGemma and Gemma3-MM impl… (#27309)
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
2025-10-22 10:05:34 -07:00
Russell Bryant
58fab50d82
[Frontend] Require flag for loading text and image embeds (#27204)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-22 15:52:02 +00:00
Isotr0py
675aa2ec64
[Model] Upstream Deepseek-OCR model (#27247)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-22 07:59:15 -07:00
dongbo910220
3ae082c373
[Chore] Separate out optional dependency checks from vllm.utils (#27207)
Signed-off-by: dongbo910220 <1275604947@qq.com>
Signed-off-by: dongbo910220 <32610838+dongbo910220@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-22 10:44:21 -04:00
wangxiyuan
f6027b2855
[1/N][Platform] Cleanup useless function (#26982)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-10-22 09:04:57 +00:00
Yi Zhang
f32bf7582e
[Model][VLM] Support Bee-8B Model (#27012)
Signed-off-by: uyzhang <yi.zhang.4096@gmail.com>
Signed-off-by: Yi Zhang <zhangyi970819@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-20 02:31:26 +00:00
Cyrus Leung
d31f7844f8
[Misc] Move utils to avoid conflicts with stdlib, and move tests (#27169)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-19 05:20:55 -07:00
iAmir97
7a6c8c3fa1
[Chore] Separate out vllm.utils.network_utils (#27164)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
2025-10-19 03:06:32 -07:00
Isotr0py
6ac5e06f7c
[Chore] Clean up pytorch helper functions in vllm.utils (#26908)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: isotr0py <2037008807@qq.com>
2025-10-18 09:48:22 -07:00
iAmir97
1d165d6d85
[Chore] Separate out vllm.utils.mem_utils (#27143)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Signed-off-by: iAmir97 <71513472+iAmir97@users.noreply.github.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-18 10:06:59 +00:00
Harry Mellor
6c9fdbf725
[Docs] Replace rst style double-backtick with md single-backtick (#27091)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:47:34 -07:00
Said Taghadouini
3aeb19a39e
[Model] Add support for LightOnOCR (#26916)
Signed-off-by: Said Taghadouini <taghadouinisaid@gmail.com>
Signed-off-by: Said Taghadouini <84044788+staghado@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-17 05:05:24 +00:00
Cyrus Leung
8c017b3490
[Model] Always use Transformers backend for PaliGemma and Gemma3-MM (#26715)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-17 05:03:35 +00:00
Harry Mellor
fb5e10d3fb
Refactor Transformers backend to use mixins (#26906)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-16 21:50:39 +00:00
Tahsin Tunan
43721bc67f
[CI] Replace large models with tiny alternatives in tests (#24057)
Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-16 15:51:27 +01:00
Cyrus Leung
d2740fafbf
[Chore] Separate out vllm.utils.collections (#26990)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-16 08:35:35 +00:00
Cyrus Leung
f6cdc9a02f
[Chore] Rename utils submodules (#26920)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-16 03:58:13 +00:00
Cyrus Leung
136a17fe6e
[Chore] Separate out vllm.utils.func (#26904)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-15 13:03:58 +00:00
wangxiyuan
8f4b313c37
[Misc] rename torch_dtype to dtype (#26695)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-10-15 12:11:48 +00:00
wang.yuqi
f54f85129e
[Model][2/N] Improve all pooling task | Support multi-vector retrieval (#25370)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-10-15 11:14:41 +00:00
Jaya Yuan
ea97940d6c
[DCP] Support Decode Context Parallel (DCP) for GQA with FlashAttention (#24864)
Signed-off-by: yuanyongjie.yyj <yuanyongjie.yyj@antgroup.com>
Signed-off-by: FENP <32334296+FENP@users.noreply.github.com>
Signed-off-by: Jaya Yuan <yuanyongjie.yyj@antgroup.com>
2025-10-14 13:07:50 +00:00
Cyrus Leung
9c4cb68339
[Chore] Remove SupportsV0Only interface and update supported models docs (#26783)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-14 04:55:10 -07:00
Cyrus Leung
d1d063a588
[Chore] Use max_transformers_version for Qwen-VL test (#26792)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-14 03:03:46 -07:00
Isotr0py
8e67b2557a
[Bugfix] Fix out of bound index issue for Jina-embedding-v3 RoPE with cuda graph (#26687)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-13 03:21:48 -07:00
wang.yuqi
767c3ab869
[Model][0/N] Improve all pooling task | clean up (#25817)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-10-13 16:44:50 +08:00
gjgjos
18ed7746ea
[Feature] Add support for naver/splade-v3 (BERT-based sparse embedding model) (#26339)
Signed-off-by: gjgjos <gjgjos@naver.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-10-12 17:00:52 +00:00
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Isotr0py
045b396d09
[Bugfix][CI/Build] Fix failing Mteb CI (#26638)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-12 02:42:42 -07:00