xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-11 03:57:09 +08:00

Author	SHA1	Message	Date
Cyrus Leung	09dc7c690c	[Chore][1/2] Drop `v0.14` deprecations (#31285 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 09:54:01 -08:00
Cyrus Leung	d201807339	[Chore] Bump `lm-eval` version (#31264 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 05:39:13 -08:00
Yuan Tang	0736f901e7	docs: Add llm-d integration to the website (#31234 ) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-12-23 20:27:22 +00:00
Jakub Zakrzewski	23daef548d	[Frontend] Support using chat template as custom score template for reranking models (#30550 ) Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-23 11:19:16 +00:00
Yan Ma	f1c2c20136	[XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation (#30538 ) Signed-off-by: Yan Ma <yan.ma@intel.com>	2025-12-23 05:22:15 +00:00
Michael Goin	6d518ffbaa	[CI Failure] Disable mosaicml/mpt-7b and databricks/dbrx-instruct tests (#31182 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-22 15:40:35 -08:00
Michael Goin	9586354053	[Doc] Add vllm-metal to hardware plugin documentation (#31174 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-22 20:06:29 +00:00
Roger Young	c02a2705f9	Update MiniMax-M2 ToolCall and add MiniMax-M2.1 in Docs (#31083 ) Signed-off-by: xuebi <xuebi@minimaxi.com> Co-authored-by: xuebi <xuebi@minimaxi.com>	2025-12-22 05:28:40 +00:00
CedricHuang	19cc9468fd	[Feature]: Support NVIDIA ModelOpt HF FP8 variants FP8_PER_CHANNEL_PER_TOKEN and FP8_PB_WO in vLLM (#30957 )	2025-12-21 22:34:49 -05:00
Steve Westerhouse	9d701e90d8	[Doc] Clarify FP8 KV cache computation workflow (#31071 ) Signed-off-by: westers <steve.westerhouse@origami-analytics.com>	2025-12-22 08:41:37 +08:00
Yuxuan Zhang	8a7a414374	GLM-4.7 Tool Parser and Doc Update (#30876 ) Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>	2025-12-20 00:09:58 +00:00
Zhonghua Deng	969bbc7c61	[Model] Add MiMo-V2-Flash support (#30836 ) Signed-off-by: Abatom <abzhonghua@gmail.com> Signed-off-by: Jumiar <liuanqim10@126.com> Signed-off-by: Zyann7 <zyann7@outlook.com> Co-authored-by: Jumiar <liuanqim10@126.com> Co-authored-by: Zyann7 <zyann7@outlook.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-19 17:17:03 +00:00
Andrey Talman	268a972c62	Update Pytorch version update docs (#30982 )	2025-12-19 16:08:53 +00:00
Li, Jiang	420ba2dbb6	Enable aarch64 CPU performance benchmarks (#26494 ) Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com> Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com> Co-authored-by: Ioana Ghiban <ioana.ghiban@arm.com> Co-authored-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-12-19 12:16:18 +00:00
Li, Jiang	096b25c9ed	[Doc][CPU] Fix index link for CPU regular release wheels (#31015 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-12-19 07:29:52 +00:00
Elizabeth Thomas	41b6f9200f	Remove all2all backend envvar (#30363 ) Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-18 19:46:28 +00:00
wzyrrr	326e7c3105	[Doc] Add Sophgo TPU Support (#30949 ) Co-authored-by: zhaoyang.wang <zhaoyang.wang@sophgo.com>	2025-12-18 16:29:33 +00:00
sarathc-cerebras	28d15ab56b	adds jais 2 support (#30188 ) Signed-off-by: sarathc-cerebras <sarath.chandran@cerebras.net> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-12-18 15:46:58 +00:00
Li, Jiang	cfb7e55515	[Doc][CPU] Update CPU doc (#30765 ) Signed-off-by: jiang1.li <jiang1.li@intel.com> Signed-off-by: Li, Jiang <bigpyj64@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-18 04:59:09 +00:00
Xunzhuo	e3a0f21e6c	[docs]: add ecosystem projects sr in docs/governance (#30844 ) Signed-off-by: bitliu <bitliu@tencent.com>	2025-12-17 18:45:56 +00:00
rongfu.leng	9e67c4ce98	[Docs] fix function name (#30748 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-12-17 12:14:45 +00:00
Andrew Xia	4c054d89aa	[Doc][ResponsesAPI] add documentation (#30840 ) Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>	2025-12-17 01:53:02 -08:00
Amr Mahdi	ff21a0fc85	[docker] Restructure Dockerfile for more efficient and cache-friendly builds (#30626 ) Signed-off-by: Amr Mahdi <amrmahdi@meta.com>	2025-12-15 18:52:19 -08:00
Fadi Arafeh	b2191abdca	[docs][fix] Update Arm CPU vLLM wheel installation docs (#30594 ) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-12-15 19:46:25 +00:00
Chauncey	2a1776b7ac	[Refactor] [2/N] Move tool parsers into the vLLM main directory (#30675 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-15 12:54:52 +00:00
汪志鹏	1adeb3b84c	[New Model] BAGEL support (AR only) (#28439 ) Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-15 14:58:23 +08:00
Lasha Koroshinadze	3a20450d31	Add AudioFlamingo3 model support (#30539 ) Signed-off-by: Lasha <26011196+lashahub@users.noreply.github.com> Signed-off-by: Lasha Koroshinadze <26011196+lashahub@users.noreply.github.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-14 02:14:55 -08:00
Didier Durand	1a55cfafcb	[Doc]: fixing typos in various files (#30540 ) Signed-off-by: Didier Durand <durand.didier@gmail.com> Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-12-14 02:14:37 -08:00
Qidong Su	24429d5924	[Doc] Add instructions for building docker image on GB300 with CUDA13 (#30414 ) Signed-off-by: Qidong Su <soodoshll@gmail.com>	2025-12-13 21:56:53 +00:00
Isotr0py	7c16f3fbcc	[Doc] Add documents for multi-node distributed serving with MP backend (#30509 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-12-13 18:02:29 +00:00
lif	ddbfbe5278	[Docs] Clarify Expert Parallel behavior for attention and MoE layers (#30615 ) Signed-off-by: majiayu000 <1835304752@qq.com>	2025-12-13 08:37:59 -09:00
Matthew Bonanni	f5dfbbd8e9	[Docs] Remove references to `VLLM_ATTENTION_BACKEND` (#30564 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-12-13 10:20:15 +08:00
Michael Goin	fc0119425c	Add IBM and Red Hat to compute resources sponsors (#30581 ) Signed-off-by: Michael Goin <mgoin64@gmail.com>	2025-12-13 01:34:23 +00:00
ioana ghiban	3efdc3feae	[Docs][CPU backend] Add pre-built Arm CPU Docker images (#30491 ) Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>	2025-12-11 22:03:29 +00:00
Harry Mellor	93db3256a4	Give pooling examples better names (#30488 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-11 16:22:58 +00:00
ioana ghiban	17cb540248	[Docs][CPU Backend] Add nightly and per revision pre-built Arm CPU wheels (#30402 ) Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-11 15:57:10 +00:00
wang.yuqi	a5f9fb5960	[Deprecation] Deprecation `--convert reward`, use `--convert embed` instead. (#30463 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-11 10:18:25 +00:00
xyDong0223	1a516557e1	[Doc] Add Baidu Kunlun XPU support (#30455 ) Signed-off-by: xyDong0223 <dongxinyu23@gmail.com>	2025-12-11 04:52:17 +00:00
Cyrus Leung	5a87d8b9b1	[Deprecation] Remove deprecated plugin and compilation fields for v0.13 release (#30396 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:59:35 -08:00
Xu Song	25221b44bb	Add more docs for regex (#30106 ) Signed-off-by: Xu Song <xusong.vip@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-11 00:12:21 +00:00
Seiji Eicher	b9e0951f96	[docs] Improve wide-EP performance + benchmarking documentation (#27933 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-12-10 22:15:54 +00:00
Michael Goin	fcb894222f	[Docs] Update EPLB docs (#30426 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-10 11:56:51 -09:00
Matthew Bonanni	794a7875ee	[Misc] Consistent case for `vllm bench serve` results (#30403 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-12-10 09:44:02 -08:00
Mark McLoughlin	2dcbac9077	[Docs] Generate full list of metrics in user docs (#30388 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-10 16:09:34 +00:00
Wilson Wu	3bdd426636	Fix typos in comments across multiple files (#30345 ) Signed-off-by: Wilson Wu <iwilsonwu@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-12-09 20:05:28 -08:00
Benjamin Chislett	e858bfe051	[Cleanup] Refactor profiling env vars into a CLI config (#29912 ) Signed-off-by: Benjamin Chislett <bchislett@nvidia.com> Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-09 13:29:33 -05:00
Hubert de La Jonquiere	c72ea10723	[Structured Output][Reasoning] Improves decoding throughput for models using single-token reasoning endings. (#30056 )	2025-12-09 18:54:08 +08:00
Fanli Lin	c2e1987a6e	[Doc] update Intel GPU MM status in Feature x Hardware matrix (#30294 ) Signed-off-by: Lin, Fanli <fanli.lin@intel.com>	2025-12-09 05:16:44 +00:00
Or Ozeri	4c6fd25880	kv_transfer: Rename the shared storage connectors (#30201 ) Signed-off-by: Or Ozeri <oro@il.ibm.com>	2025-12-08 20:46:09 -08:00
Ming Yang	60d17251c9	[Disagg] Support large batch size in proxy server and update NixlConnector doc for DP (#28782 ) Signed-off-by: Ming Yang <minos.future@gmail.com>	2025-12-09 00:01:08 +00:00

1788 Commits