xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-08 21:27:27 +08:00

Author	SHA1	Message	Date
LiuXiaoxuanPKU	b484b79504	fix	2025-04-01 15:46:41 -07:00
LiuXiaoxuanPKU	8fcd4d18e0	minor	2025-04-01 13:51:04 -07:00
LiuXiaoxuanPKU	50e2788383	dsd draft	2025-04-01 13:33:07 -07:00
shangmingc	239b7befdd	[V1][Spec Decode] Remove deprecated spec decode config params (#15466 ) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-03-31 09:19:35 -07:00
Cyrus Leung	09e974d483	[Bugfix] Check dimensions of multimodal embeddings in V1 (#15816 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-31 09:01:35 -07:00
Mrm	037bcd942c	[Bugfix] Fix missing return value in load_weights method of adapters.py (#15542 ) Signed-off-by: noc-turne <2270929247@qq.com>	2025-03-31 06:56:42 -07:00
Alex Brooks	c2e7507ad4	[Bugfix] Fix Crashing When Loading Modules With Batchnorm Stats (#15813 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-03-31 13:23:53 +00:00
Naveassaf	3aa2b6a637	[Model] Update support for NemotronNAS models (#15008 ) Signed-off-by: Nave Assaf <nassaf@nvidia.com>	2025-03-31 20:35:14 +08:00
youkaichao	555aa21905	[V1] Fully Transparent Implementation of CPU Offloading (#15354 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-03-31 20:22:34 +08:00
Charlie Fu	e85829450d	[Feature][ROCm]Enable fusion pass for torch.compile on ROCm (#15050 ) Signed-off-by: charlifu <charlifu@amd.com>	2025-03-31 04:42:18 -07:00
Chengyang LIU	18ed3132d2	[Misc] update the comments (#15780 ) Signed-off-by: chengyang liu <lcy4869@gmail.com> Co-authored-by: chengyang liu <lcy4869@gmail.com>	2025-03-30 19:39:56 -07:00
Woosuk Kwon	9b459eca88	[V1][Scheduler] Avoid calling `_try_schedule_encoder_inputs` for every request (#15778 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-03-30 14:10:42 -07:00
yihong	70fedd0f79	fix: Comments to English for better dev experience (#15768 ) Signed-off-by: yihong0618 <zouzou0208@gmail.com>	2025-03-30 10:47:57 -07:00
kYLe	bb103b29bf	[Bugfix] Added `embed_is_patch` mask for fuyu model (#15731 ) Signed-off-by: Kyle Huang <kylhuang@nvidia.com>	2025-03-30 03:45:08 -07:00
Cyrus Leung	803d5c35f3	[V1] Override `mm_counts` for dummy data creation (#15703 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-30 03:20:42 -07:00
pansicheng	7fd8c0f85c	fix test_phi3v (#15321 ) Signed-off-by: pansicheng <sicheng.pan.chn@gmail.com>	2025-03-30 02:01:34 -07:00
Julien Denize	6909a76201	[Bugfix] Fix Mistral guided generation using xgrammar (#15704 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai>	2025-03-29 20:20:19 -07:00
Isotr0py	3c0ff914ac	[Bugfix] Fix Mllama interleaved images input support (#15564 ) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Chen Zhang <zhangch99@outlook.com>	2025-03-29 18:11:15 +00:00
Woosuk Kwon	2bc4be4e32	[V1][Minor] Simplify rejection sampler's parse_output (#15741 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-03-29 09:25:17 -07:00
Roger Wang	c67abd614f	[V1] Support interleaved modality items (#15605 ) Signed-off-by: Roger Wang <ywang@roblox.com>	2025-03-29 06:30:09 -07:00
shangmingc	6fa7cd3dbc	[Feature][Disaggregated] Support XpYd disaggregated prefill with MooncakeStore (#12957 ) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-03-29 04:01:46 -07:00
wwl2755	94744ba41a	[V1] [Feature] Collective RPC (#15444 ) Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>	2025-03-29 03:39:14 -07:00
TJian	4965ec42d2	[FEAT] [ROCm] Add AITER int8 scaled gemm kernel (#15433 ) Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-03-29 03:33:56 -07:00
yarongmu-google	7c1f760024	[Kernel][TPU][ragged-paged-attn] vLLM code change for PR#8896 (#15659 ) Signed-off-by: Yarong Mu <ymu@google.com>	2025-03-28 21:13:15 -07:00
Nicolò Lucchesi	da461f3cbf	[TPU][V1][Bugfix] Fix w8a8 recompilation with GSM8K (#15714 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-03-28 21:13:06 -07:00
Jinzhen Lin	5b800f0932	[Bugfix] set VLLM_WORKER_MULTIPROC_METHOD=spawn for vllm.entrypoints.openai.api_server (#15700 ) Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>	2025-03-28 21:12:26 -07:00
Varun Sundar Rabindranath	1286211f57	[Bugfix] LoRA V1: add and fix entrypoints tests (#15715 ) Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com> Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>	2025-03-28 21:10:41 -07:00
Nick Hill	6d531ad7b8	[Misc][V1] Misc code streamlining (#15723 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-03-28 20:59:47 -07:00
pengyuange	de1cb38769	[Model] Support Skywork-R1V (#15397 ) Signed-off-by: jiacai.liu <932997367@qq.com> Co-authored-by: jiacai.liu <932997367@qq.com>	2025-03-28 20:39:21 -07:00
daniel-salib	f3f8d8fff4	implement prometheus fast-api-instrumentor for http service metrics (#15657 )	2025-03-29 00:12:02 +00:00
Reid	26df46ee59	[Misc] cli auto show default value (#15582 ) Signed-off-by: reidliu41 <reid201711@gmail.com>	2025-03-28 22:23:00 +00:00
Alexander Matveev	c3f687ac22	[V1] TPU - Fix the chunked prompt bug (#15713 ) Signed-off-by: Alexander Matveev <amatveev@redhat.com>	2025-03-28 20:19:04 +00:00
Luka Govedič	04437e313d	[Bugfix] [torch.compile] Add Dynamo metrics context during compilation (#15639 ) Signed-off-by: luka <luka@neuralmagic.com>	2025-03-28 14:01:09 -06:00
Robert Shaw	038bededba	[TPU] [Perf] Improve Memory Usage Estimation (#15671 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <robshaw@redhat.com>	2025-03-28 17:37:52 +00:00
shangmingc	d03308be0c	[Misc] Remove stale func in KVTransferConfig (#14746 ) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>	2025-03-28 17:33:32 +00:00
Cyrus Leung	c6bc0034d0	[Misc] Remove unused utils and clean up imports (#15708 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-28 09:41:16 -07:00
Kebe	432cf22a6a	[Bugfix] Fix regex compile display format (#15368 ) Signed-off-by: Kebe <mail@kebe7jun.com>	2025-03-28 08:58:44 -07:00
Russell Bryant	7329ff5468	[V1] Support disable_any_whitespace for guidance backend (#15584 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-03-28 23:46:45 +08:00
Cyrus Leung	541d1df486	[Bugfix] `embed_is_patch` for Idefics3 (#15696 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-28 08:27:52 -07:00
Chauncey	3b00ff9138	[Bugfix][v1] xgrammar structured output supports Enum. (#15594 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-03-28 06:14:53 -07:00
Jee Jee Li	91276c5721	[Model] Adding torch compile annotations to chatglm (#15624 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-03-28 21:14:09 +08:00
Reid	fd5fd26902	[Frontend] update priority for --api-key and VLLM_API_KEY (#15588 ) Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>	2025-03-28 19:40:12 +08:00
Ce Gao	3bbaacbe15	[Bugfix][Frontend] Eliminate regex based check in reasoning full generator (#14821 ) Signed-off-by: Ce Gao <cegao@tensorchord.ai>	2025-03-28 11:20:35 +00:00
Jee Jee Li	70f2c2a709	[Bugfix] Fix 'InductorAdaptor object has no attribute 'cache_dir' (#15674 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-03-28 17:10:40 +08:00
Ce Gao	32b14baf8a	[Refactor][Frontend] Keep all logic about reasoning into one class (#14428 ) Signed-off-by: Ce Gao <cegao@tensorchord.ai>	2025-03-28 00:23:30 -07:00
Cyrus Leung	355f66348c	[V1] Remove legacy input registry (#15673 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-27 23:34:34 -07:00
Cyrus Leung	8693e47e6a	[Bugfix] Fix `mm_hashes` forgetting to be passed (#15668 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-03-28 05:51:05 +00:00
Jason (Siyu) Zhu	cec8c7d7f8	Refactor error handling for multiple exceptions in preprocessing (#15650 ) Signed-off-by: JasonZhu1313 <jasonchu13@outlook.com>	2025-03-28 03:27:20 +00:00
Gregory Shtrasberg	4d0ec37267	[Quantization][FP8] Adding support for fp8 gemm layer input in fp8 (#14578 ) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-03-28 02:58:16 +00:00
Wes	4ae17bf1e2	Revert "Use Cache Hinting for fused_moe kernel (#15511 )" (#15645 ) Signed-off-by: Wes Medford <wryanmedford@gmail.com>	2025-03-27 19:45:55 -07:00


3754 Commits