1433 Commits

Author SHA1 Message Date
Gregory Shtrasberg
6a50eaa0d3
[DOCs] Update ROCm installation docs section (#24691)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-09-11 20:02:53 -07:00
Harry Mellor
361ae27f8a
[Docs] Fix formatting of transcription doc (#24676)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-11 11:18:06 -07:00
Harry Mellor
51d41265ad
[Docs] Fix typos in EP deployment doc (#24669)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-11 09:07:23 -07:00
Wentao Ye
4984a291d5
[Doc] Fix Markdown Pre-commit Error (#24670)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-09-11 09:05:59 -07:00
Nicolò Lucchesi
404c85ca72
[Docs] Add transcription support to model (#24664)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-11 07:39:01 -07:00
youkaichao
f510715882
[build] add torch to tool.uv no-build-isolation-package (#24303)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-11 13:19:44 +00:00
Tao He
f946197473
[Docs] Fixes a typo in the qwen3next model name. (#24654)
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
2025-09-11 19:35:14 +08:00
Michael Yao
d14c4ebf08
[Docs] Use 1-2-3 list for deploy steps in deployment/frameworks/ (#24633)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-11 01:50:12 -07:00
Russell Bryant
ba6011027d
[Docs] Update V1 doc to reflect whisper support (#24606)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-09-11 01:50:08 -07:00
Michael Yao
85df8afdae
[Docs] Revise frameworks/anything-llm.md (#24489)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-11 01:50:05 -07:00
Tao He
e93f4cc9e3
Add the support for the qwen3 next model (a hybrid attention model). (#24526)
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-11 15:32:09 +08:00
TaehyunKim
9bd831f501
[Model] New model support for Motif-1-Tiny (#23414)
Signed-off-by: ca1207 <ca1207zzz@gmail.com>
Signed-off-by: TaehyunKim <73943231+ca1207@users.noreply.github.com>
Co-authored-by: WyldeCat <skan1543@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-10 23:29:40 -07:00
youkaichao
8c5a747246
[distributed] update known issues (#24624)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-09-11 11:09:38 +08:00
Robin
36cacd0958
[Doc] Add documentation for GLM-4.5 series models: tool-calling and reasoning parser (#24589)
Signed-off-by: WangErXiao <863579016@qq.com>
2025-09-10 07:50:55 -07:00
Yash Pratap Singh
9e3c3a7df2
[LoRA]: Add LoRA support to Mistral's Voxtral models (#24517)
Signed-off-by: Yash Pratap Singh <yashsingh20001@gmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-09-10 06:12:03 -07:00
Tyler Michael Smith
8b83b93739
[Docs] Document the extra memory footprint overhead when using EPLB (#24537)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-09-10 06:09:49 -07:00
Harry Mellor
9dbefd88e9
[Docs] Improve organisation of API Reference nav (#24569)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-10 06:08:21 -07:00
Harry Mellor
e40827280b
[Docs] Enable relative links in examples to function when rendered in the docs (#24041)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-09 21:40:45 -07:00
Nicolò Lucchesi
3707cb2505
[Docs] Gemma3n transcriptions endpoint support (#24512)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-09 11:03:32 -07:00
Didier Durand
46876dff32
[Doc]: fixing typos to improve docs (#24480)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-09-08 23:06:04 -07:00
Mickaël Seznec
ed16d0f26f
[Doc] mention fpdb for multiprocess breakpoints (#24452)
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
2025-09-08 21:46:45 -07:00
Cyrus Leung
948dd3443b
[Bugfix] Fix Apertus HF repo name (#24447)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-08 21:40:29 -07:00
rongfu.leng
c44797a4d6
[Docs]add eplb_config param use docs (#24213)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-09-08 09:36:57 -07:00
Didier Durand
55be93baf5
[Doc]: fix 2 hyperlinks leading to Ray site after they changed Ray's doc structure (#24438)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-08 09:36:54 -07:00
Harry Mellor
717fc00e98
[Docs] Move feature compatibility tables to README (#24431)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-08 06:45:14 -07:00
Chenheli Hua
01dfb5e982
[Frontend] User-provided uuids for medias in chat. (RFC #22044) (#23449)
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Chenheli Hua <huachenheli@outlook.com>
Signed-off-by: Roger Wang <hey@rogerw.me>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.me>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-09-08 06:42:20 -07:00
Michael Yao
c2a8b08fcd
[Doc] Fix issues in integrations/llamastack.md (#24428)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-08 02:28:32 -07:00
Michael Yao
2f0b833a05
[Docs] Fix a tip indentation and typo (#24419)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-08 00:19:40 -07:00
Al-Ekram Elahee Hridoy
8c892b1831
[Doc] Fix UTF-8 encoding issues in documentation generation on Windows (#24361)
Signed-off-by: alekramelaheehridoy <aliqramalaheehridoy@gmail.com>
Signed-off-by: alekramelaheehridoy <alekramelaheehridoy@gmail.com>
Co-authored-by: alekramelaheehridoy <alekramelaheehridoy@gmail.com>
2025-09-07 22:33:52 -07:00
Yan Ma
67841317d1
[xpu] upgrade ipex/python3.12 for xpu (#23830)
Signed-off-by: Yan Ma <yan.ma@intel.com>
2025-09-08 02:07:16 +00:00
wang.yuqi
6d6c6b05d3
[New Model]: google/embeddinggemma-300m (#24318)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-09-05 22:58:36 -07:00
Didier Durand
35bf193864
[Doc]: fix typos in Python comments (#24294)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-09-05 19:41:12 -07:00
youkaichao
7812bcf278
[docs] add shenzhen meetup (#24326)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-09-05 22:48:42 +08:00
Louie Tsai
006e7a34ae
Adding int4 and int8 models for CPU benchmarking (#23709)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-09-05 20:08:50 +08:00
Yash Pratap Singh
c9f7081f9c
[LoRA]: Add lora support to qwen-2.5-omni (#24231) 2025-09-04 05:50:50 -07:00
Jiangyun Zhu
eafa8dcde6
[Model] Add pp support for hunyuan (#24212)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-09-04 03:58:26 -07:00
TJian
6c7af8110a
[Doc] Update vLLM Singapore Meetup info (#24234)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-09-04 02:58:18 -07:00
bingchen-mi
e7fc70016f
[Model] Add MiDashengLM model support (#23652)
Signed-off-by: chenbing8 <chenbing8@xiaomi.com>
Signed-off-by: bingchen-mi <chenbing8@xiaomi.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-04 00:08:09 -07:00
bnellnm
e9b92dcd89
[Kernels] Overlap shared experts with send/recv (#23273)
Signed-off-by: Bill Nell <bnell@redhat.com>
2025-09-03 12:35:18 -04:00
nopperl
fa4311d85f
[V1] v1 engine + full CUDA graph support for PLaMo2 (#23998)
Signed-off-by: Hemmi Shinichi <shemmi@preferred.jp>
Signed-off-by: nopperl <54780682+nopperl@users.noreply.github.com>
Co-authored-by: Hemmi Shinichi <shemmi@preferred.jp>
Co-authored-by: Thomas Parnell <tom.parnell@gmail.com>
2025-09-03 08:24:02 -07:00
youkaichao
f38035c123
[distributed][rl] remove nccl cumem env var override (#24141)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-09-03 06:45:25 +00:00
co63oc
1bd007f234
fix some typos (#24071)
Signed-off-by: co63oc <co63oc@users.noreply.github.com>
2025-09-02 20:44:50 -07:00
Peter Pan
0e1759cd54
[docs] add SYS_NICE cap & security-opt for docker/k8s (#24017)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <peter.pan@daocloud.io>
Co-authored-by: Li, Jiang <bigpyj64@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-02 17:27:20 +00:00
Christian Berge
8bd5844989
correct LWS deployment yaml (#23104)
Signed-off-by: cberge908 <42270330+cberge908@users.noreply.github.com>
2025-09-02 12:04:59 +00:00
WeiQing Chen
2f0bab3f26
[Model] Support dp on ViT on GLM-4.5V (#23168)
Signed-off-by: David Chen <530634352@qq.com>
2025-09-02 10:48:18 +00:00
WeiQing Chen
a0e0efd6bd
[Model] Support DP for ViT on Kimi-VL-A3B-Thinking-2506 (#23817)
Signed-off-by: Junhong <liujunhong11@huawei.com>
Signed-off-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Junhong <liujunhong11@huawei.com>
Co-authored-by: LJH-LBJ <98734602+LJH-LBJ@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-09-01 16:56:56 +00:00
Christian Pinto
cf91a89dd2
[docs][misc] IOProcessor plugins fixes (#24046)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
2025-09-01 09:17:41 -07:00
Julien Debache
41c80698b3
Document multi-proc method selection for profiling (#23802)
Signed-off-by: jdebache <jdebache@nvidia.com>
2025-09-01 06:28:26 -07:00
Kwai-Keye
7c8271cd1e
[Model]: support KeyeVL-1_5-8B (#23838)
Signed-off-by: wangruitao <wangruitao@kuaishou.com>
Co-authored-by: wangruitao <wangruitao@kuaishou.com>
2025-09-01 03:50:27 -07:00
Kay Yan
3e330fcb21
[Doc]: Fix CPU install docs: force torch-backend=cpu to avoid GPU torchvision errors (#24033)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-09-01 03:34:52 -07:00