Elizabeth Thomas
41b6f9200f
Remove all2all backend envvar ( #30363 )
...
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-18 19:46:28 +00:00
wzyrrr
326e7c3105
[Doc] Add Sophgo TPU Support ( #30949 )
...
Co-authored-by: zhaoyang.wang <zhaoyang.wang@sophgo.com>
2025-12-18 16:29:33 +00:00
sarathc-cerebras
28d15ab56b
adds jais 2 support ( #30188 )
...
Signed-off-by: sarathc-cerebras <sarath.chandran@cerebras.net>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-12-18 15:46:58 +00:00
Li, Jiang
cfb7e55515
[Doc][CPU] Update CPU doc ( #30765 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Signed-off-by: Li, Jiang <bigpyj64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-18 04:59:09 +00:00
Xunzhuo
e3a0f21e6c
[docs]: add ecosystem projects sr in docs/governance ( #30844 )
...
Signed-off-by: bitliu <bitliu@tencent.com>
2025-12-17 18:45:56 +00:00
rongfu.leng
9e67c4ce98
[Docs] fix function name ( #30748 )
...
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-12-17 12:14:45 +00:00
Andrew Xia
4c054d89aa
[Doc][ResponsesAPI] add documentation ( #30840 )
...
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
2025-12-17 01:53:02 -08:00
Amr Mahdi
ff21a0fc85
[docker] Restructure Dockerfile for more efficient and cache-friendly builds ( #30626 )
...
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
2025-12-15 18:52:19 -08:00
Fadi Arafeh
b2191abdca
[docs][fix] Update Arm CPU vLLM wheel installation docs ( #30594 )
...
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
2025-12-15 19:46:25 +00:00
Chauncey
2a1776b7ac
[Refactor] [2/N] Move tool parsers into the vLLM main directory ( #30675 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-12-15 12:54:52 +00:00
汪志鹏
1adeb3b84c
[New Model] BAGEL support (AR only) ( #28439 )
...
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-15 14:58:23 +08:00
Lasha Koroshinadze
3a20450d31
Add AudioFlamingo3 model support ( #30539 )
...
Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com>
Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-14 02:14:55 -08:00
Didier Durand
1a55cfafcb
[Doc]: fixing typos in various files ( #30540 )
...
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-12-14 02:14:37 -08:00
Qidong Su
24429d5924
[Doc] Add instructions for building docker image on GB300 with CUDA13 ( #30414 )
...
Signed-off-by: Qidong Su <soodoshll@gmail.com>
2025-12-13 21:56:53 +00:00
Isotr0py
7c16f3fbcc
[Doc] Add documents for multi-node distributed serving with MP backend ( #30509 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-13 18:02:29 +00:00
lif
ddbfbe5278
[Docs] Clarify Expert Parallel behavior for attention and MoE layers ( #30615 )
...
Signed-off-by: majiayu000 <1835304752@qq.com>
2025-12-13 08:37:59 -09:00
Matthew Bonanni
f5dfbbd8e9
[Docs] Remove references to VLLM_ATTENTION_BACKEND ( #30564 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-12-13 10:20:15 +08:00
Michael Goin
fc0119425c
Add IBM and Red Hat to compute resources sponsors ( #30581 )
...
Signed-off-by: Michael Goin <mgoin64@gmail.com>
2025-12-13 01:34:23 +00:00
ioana ghiban
3efdc3feae
[Docs][CPU backend] Add pre-built Arm CPU Docker images ( #30491 )
...
Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>
2025-12-11 22:03:29 +00:00
Harry Mellor
93db3256a4
Give pooling examples better names ( #30488 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-11 16:22:58 +00:00
ioana ghiban
17cb540248
[Docs][CPU Backend] Add nightly and per revision pre-built Arm CPU wheels ( #30402 )
...
Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-11 15:57:10 +00:00
wang.yuqi
a5f9fb5960
[Deprecation] Deprecation --convert reward, use --convert embed instead. ( #30463 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-12-11 10:18:25 +00:00
xyDong0223
1a516557e1
[Doc] Add Baidu Kunlun XPU support ( #30455 )
...
Signed-off-by: xyDong0223 <dongxinyu23@gmail.com>
2025-12-11 04:52:17 +00:00
Cyrus Leung
5a87d8b9b1
[Deprecation] Remove deprecated plugin and compilation fields for v0.13 release ( #30396 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-10 19:59:35 -08:00
Xu Song
25221b44bb
Add more docs for regex ( #30106 )
...
Signed-off-by: Xu Song <xusong.vip@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-11 00:12:21 +00:00
Seiji Eicher
b9e0951f96
[docs] Improve wide-EP performance + benchmarking documentation ( #27933 )
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
2025-12-10 22:15:54 +00:00
Michael Goin
fcb894222f
[Docs] Update EPLB docs ( #30426 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-12-10 11:56:51 -09:00
Matthew Bonanni
794a7875ee
[Misc] Consistent case for vllm bench serve results ( #30403 )
...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-12-10 09:44:02 -08:00
Mark McLoughlin
2dcbac9077
[Docs] Generate full list of metrics in user docs ( #30388 )
...
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-10 16:09:34 +00:00
Wilson Wu
3bdd426636
Fix typos in comments across multiple files ( #30345 )
...
Signed-off-by: Wilson Wu <iwilsonwu@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-12-09 20:05:28 -08:00
Benjamin Chislett
e858bfe051
[Cleanup] Refactor profiling env vars into a CLI config ( #29912 )
...
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-09 13:29:33 -05:00
Hubert de La Jonquiere
c72ea10723
[Structured Output][Reasoning] Improves decoding throughput for models using single-token reasoning endings. ( #30056 )
2025-12-09 18:54:08 +08:00
Fanli Lin
c2e1987a6e
[Doc] update Intel GPU MM status in Feature x Hardware matrix ( #30294 )
...
Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2025-12-09 05:16:44 +00:00
Or Ozeri
4c6fd25880
kv_transfer: Rename the shared storage connectors ( #30201 )
...
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2025-12-08 20:46:09 -08:00
Ming Yang
60d17251c9
[Disagg] Support large batch size in proxy server and update NixlConnector doc for DP ( #28782 )
...
Signed-off-by: Ming Yang <minos.future@gmail.com>
2025-12-09 00:01:08 +00:00
Simon Mo
77072e93b3
[docs] governance documents ( #24801 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-12-08 12:06:20 +00:00
wang.yuqi
9e77ffca3f
[Model][7/N] Improve all pooling task | Deprecation as_reward_model. Extract hidden states prefer using new multi-vector retrieval API ( #26686 )
...
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-12-08 08:10:09 +00:00
Zhiyu
cd00c443d2
[Misc] Rename TensorRT Model Optimizer to Model Optimizer ( #30091 )
...
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
2025-12-08 07:05:27 +00:00
Isotr0py
b952f4d3c3
[v1] Add PrefixLM support to FlexAttention backend ( #27938 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-07 15:51:36 +00:00
Cyrus Leung
e83b7e379c
Revert "[Renderer] Separate out RendererConfig from ModelConfig ( #30145 )" ( #30199 )
2025-12-07 00:00:22 -08:00
Cyrus Leung
27f4c2fd46
[Renderer] Separate out RendererConfig from ModelConfig ( #30145 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-12-06 23:15:42 -08:00
jeremyteboul
dce6d229f7
Support multiple image/audio embeddings per requests ( #29988 )
...
Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com>
Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>
2025-12-07 04:34:24 +00:00
Viacheslav
21bb323542
Gigachat 3 tool parser and tests ( #29905 )
...
Signed-off-by: Viacheslav Barinov <viacheslav.teh@gmail.com>
2025-12-06 12:04:14 +00:00
redwrasse
6476382384
prefix caching design doc sha256 now default ( #29261 )
...
Signed-off-by: redwrasse <mail@redwrasse.io>
2025-12-06 07:39:56 +00:00
Russell Bryant
3633035a3f
[Misc] Rename CohereForAI references to CohereLabs ( #30147 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-12-05 19:41:40 +00:00
Yanan Cao
62b3333448
[Frontend] Remove deprecated -O.xx flag ( #29991 )
...
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
2025-12-05 00:47:22 -08:00
Tiger Xu / Zhonghu Xu
60a66ea2dc
[DOC]: Add kthena to integrations ( #29931 )
...
Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>
2025-12-05 08:11:03 +00:00
Hubert de La Jonquiere
befb59e5b1
[Model] Add Holo2 reasoning parser ( #30048 )
...
Signed-off-by: hdlj-h <hubert@hcompany.ai>
2025-12-05 10:38:45 +08:00
TimWang
690cc3ef20
docs: update metrics design doc to use new vllm:kv_cache_usage_perc ( #30041 )
...
Signed-off-by: Tim <tim.wang03@sap.com>
2025-12-04 23:37:14 +00:00
Tao Yun
6dcb07f676
support qwen3-vl handle requests with embeddings ( #30037 )
...
Signed-off-by: taoyun <1069423820@qq.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-12-04 17:34:06 +00:00