Russell Bryant
ba6011027d
[Docs] Update V1 doc to reflect whisper support ( #24606 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-09-11 01:50:08 -07:00
Michael Yao
85df8afdae
[Docs] Revise frameworks/anything-llm.md ( #24489 )
...
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-11 01:50:05 -07:00
Tao He
e93f4cc9e3
Add the support for the qwen3 next model (a hybrid attention model). ( #24526 )
...
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-11 15:32:09 +08:00
TaehyunKim
9bd831f501
[Model] New model support for Motif-1-Tiny ( #23414 )
...
Signed-off-by: ca1207 <ca1207zzz@gmail.com>
Signed-off-by: TaehyunKim <73943231+ca1207@users.noreply.github.com>
Co-authored-by: WyldeCat <skan1543@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-10 23:29:40 -07:00
youkaichao
8c5a747246
[distributed] update known issues ( #24624 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-09-11 11:09:38 +08:00
Robin
36cacd0958
[Doc] Add documentation for GLM-4.5 series models: tool-calling and reasoning parser ( #24589 )
...
Signed-off-by: WangErXiao <863579016@qq.com>
2025-09-10 07:50:55 -07:00
Yash Pratap Singh
9e3c3a7df2
[LoRA]: Add LoRA support to Mistral's Voxtral models ( #24517 )
...
Signed-off-by: Yash Pratap Singh <yashsingh20001@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-10 06:12:03 -07:00
Tyler Michael Smith
8b83b93739
[Docs] Document the extra memory footprint overhead when using EPLB ( #24537 )
...
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-09-10 06:09:49 -07:00
Harry Mellor
9dbefd88e9
[Docs] Improve organisation of API Reference nav ( #24569 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-10 06:08:21 -07:00
Harry Mellor
e40827280b
[Docs] Enable relative links in examples to function when rendered in the docs ( #24041 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-09 21:40:45 -07:00
Nicolò Lucchesi
3707cb2505
[Docs] Gemma3n transcriptions endpoint support ( #24512 )
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-09 11:03:32 -07:00
Didier Durand
46876dff32
[Doc]: fixing typos to improve docs ( #24480 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-09-08 23:06:04 -07:00
Mickaël Seznec
ed16d0f26f
[Doc] mention fpdb for multiprocess breakpoints ( #24452 )
...
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
2025-09-08 21:46:45 -07:00
Cyrus Leung
948dd3443b
[Bugfix] Fix Apertus HF repo name ( #24447 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-08 21:40:29 -07:00
rongfu.leng
c44797a4d6
[Docs]add eplb_config param use docs ( #24213 )
...
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-09-08 09:36:57 -07:00
Didier Durand
55be93baf5
[Doc]: fix 2 hyperlinks leading to Ray site after they changed Ray's doc structure ( #24438 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-08 09:36:54 -07:00
Harry Mellor
717fc00e98
[Docs] Move feature compatibility tables to README ( #24431 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-08 06:45:14 -07:00
Chenheli Hua
01dfb5e982
[Frontend] User-provided uuids for medias in chat. (RFC #22044 ) ( #23449 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-09-08 06:42:20 -07:00
Michael Yao
c2a8b08fcd
[Doc] Fix issues in integrations/llamastack.md ( #24428 )
...
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-08 02:28:32 -07:00
Michael Yao
2f0b833a05
[Docs] Fix a tip indentation and typo ( #24419 )
...
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-08 00:19:40 -07:00
Al-Ekram Elahee Hridoy
8c892b1831
[Doc] Fix UTF-8 encoding issues in documentation generation on Windows ( #24361 )
...
Signed-off-by: alekramelaheehridoy <aliqramalaheehridoy@gmail.com>
Signed-off-by: alekramelaheehridoy <alekramelaheehridoy@gmail.com>
Co-authored-by: alekramelaheehridoy <alekramelaheehridoy@gmail.com>
2025-09-07 22:33:52 -07:00
Yan Ma
67841317d1
[xpu] upgrade ipex/python3.12 for xpu ( #23830 )
...
Signed-off-by: Yan Ma <yan.ma@intel.com>
2025-09-08 02:07:16 +00:00
wang.yuqi
6d6c6b05d3
[New Model]: google/embeddinggemma-300m ( #24318 )
...
Signed-off-by: wang.yuqi <noooop@126.com>
2025-09-05 22:58:36 -07:00
Didier Durand
35bf193864
[Doc]: fix typos in Python comments ( #24294 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-09-05 19:41:12 -07:00
youkaichao
7812bcf278
[docs] add shenzhen meetup ( #24326 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-09-05 22:48:42 +08:00
Louie Tsai
006e7a34ae
Adding int4 and int8 models for CPU benchmarking ( #23709 )
...
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-09-05 20:08:50 +08:00
Yash Pratap Singh
c9f7081f9c
[LoRA]: Add lora support to qwen-2.5-omni ( #24231 )
2025-09-04 05:50:50 -07:00
Jiangyun Zhu
eafa8dcde6
[Model] Add pp support for hunyuan ( #24212 )
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-09-04 03:58:26 -07:00
TJian
6c7af8110a
[Doc] Update vLLM Singapore Meetup info ( #24234 )
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-09-04 02:58:18 -07:00
bingchen-mi
e7fc70016f
[Model] Add MiDashengLM model support ( #23652 )
...
Signed-off-by: chenbing8 <chenbing8@xiaomi.com>
Signed-off-by: bingchen-mi <chenbing8@xiaomi.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-04 00:08:09 -07:00
bnellnm
e9b92dcd89
[Kernels] Overlap shared experts with send/recv ( #23273 )
...
Signed-off-by: Bill Nell <bnell@redhat.com>
2025-09-03 12:35:18 -04:00
nopperl
fa4311d85f
[V1] v1 engine + full CUDA graph support for PLaMo2 ( #23998 )
...
Signed-off-by: Hemmi Shinichi <shemmi@preferred.jp>
Signed-off-by: nopperl <54780682+nopperl@users.noreply.github.com>
Co-authored-by: Hemmi Shinichi <shemmi@preferred.jp>
Co-authored-by: Thomas Parnell <tom.parnell@gmail.com>
2025-09-03 08:24:02 -07:00
youkaichao
f38035c123
[distributed][rl] remove nccl cumem env var override ( #24141 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-03 06:45:25 +00:00
co63oc
1bd007f234
fix some typos ( #24071 )
...
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-09-02 20:44:50 -07:00
Peter Pan
0e1759cd54
[docs] add SYS_NICE cap & security-opt for docker/k8s ( #24017 )
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <peter.pan@daocloud.io>
Co-authored-by: Li, Jiang <bigpyj64@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-02 17:27:20 +00:00
Christian Berge
8bd5844989
correct LWS deployment yaml ( #23104 )
...
Signed-off-by: cberge908 <42270330+cberge908@users.noreply.github.com>
2025-09-02 12:04:59 +00:00
WeiQing Chen
2f0bab3f26
[Model] Support dp on ViT on GLM-4.5V ( #23168 )
...
Signed-off-by: David Chen <530634352@qq.com>
2025-09-02 10:48:18 +00:00
WeiQing Chen
a0e0efd6bd
[Model] Support DP for ViT on Kimi-VL-A3B-Thinking-2506 ( #23817 )
...
Signed-off-by: Junhong <liujunhong11@huawei.com>
Signed-off-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-09-01 16:56:56 +00:00
Christian Pinto
cf91a89dd2
[docs][misc] IOProcessor plugins fixes ( #24046 )
...
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
2025-09-01 09:17:41 -07:00
Julien Debache
41c80698b3
Document multi-proc method selection for profiling ( #23802 )
...
Signed-off-by: jdebache <jdebache@nvidia.com>
2025-09-01 06:28:26 -07:00
Kwai-Keye
7c8271cd1e
[Model]: support KeyeVL-1_5-8B ( #23838 )
...
Signed-off-by: wangruitao <wangruitao@kuaishou.com>
Co-authored-by: wangruitao <wangruitao@kuaishou.com>
2025-09-01 03:50:27 -07:00
Kay Yan
3e330fcb21
[Doc]: Fix CPU install docs: force torch-backend=cpu to avoid GPU torchvision errors ( #24033 )
...
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-09-01 03:34:52 -07:00
Christian Pinto
1cb39dbcdd
[Misc] IO Processor plugins for pooling models ( #22820 )
...
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
2025-08-31 23:07:12 -07:00
Roger Wang
749be00a98
[Core][Multimodal] Allow passing multi_modal_uuids as multimodal identifiers. ( #23394 )
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-08-30 18:01:22 -07:00
sadegh.shokatian
379ea2823a
Add LoRA support for DeepSeek models (V2, V3, R1-0528) ( #23971 )
...
Signed-off-by: sadeghja1070 <sadegh.ja1070@gmail.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-08-30 06:40:02 -07:00
Jiangyun Zhu
3a6acad431
[Model] Enable encoder DP for MiniCPM-V ( #23948 )
...
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
Signed-off-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-08-30 06:31:26 -07:00
Thomas Parnell
1c26b42296
[Docs] [V1] [Hybrid] Add new documentation re: contributing mamba-based models ( #23824 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-08-29 18:47:58 +00:00
Li, Jiang
ad39106b16
[CPU] Enable data parallel for CPU backend ( #23903 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-08-29 02:19:58 -07:00
Didier Durand
d99c3a4f7b
[Doc]: fix typos in .md files (including those of #23751 ) ( #23825 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-08-28 04:38:19 -07:00
Isotr0py
c5d004aaaf
[Model] Add PP support and VLM backbone compatability for GPT-OSS ( #23680 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-08-28 16:03:28 +08:00