xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-24 01:07:17 +08:00

Author	SHA1	Message	Date
Boyuan Feng	b158df2813	remove resolve_op_overloads and use splitting_ops directly (#28081 ) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-11-08 01:13:13 +00:00
Kunshang Ji	1aaecda078	[XPU] Enable Expert parallel for MoE models (#28263 ) Signed-off-by: Yan Ma <yan.ma@intel.com> Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-11-08 00:33:11 +00:00
Harry Mellor	811df41ee9	Update Flashinfer from `v0.4.1` to `v0.5.2` (#27952 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-07 16:24:42 -08:00
Nick Hill	67a2da890e	[PerfFix] Avoid separate thread for MP executor shm spin (take 2) (#28319 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-07 22:11:03 +00:00
Nick Hill	da786e339e	[Core] Rework handling of async scheduling config (#28250 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-07 20:01:23 +00:00
Benjamin Chislett	18903216f5	[Bugfix] Fix and add tests for GptOss reasoning parser (#28000 ) Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>	2025-11-07 19:28:04 +00:00
Simon Mo	d0ceb38ae8	[Build] Fix release pipeline failing annotation (#28272 ) Signed-off-by: simon-mo <simon.mo@hey.com> Signed-off-by: Simon Mo <simon.mo@hey.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-07 10:06:45 -08:00
youkaichao	155ad56d7b	[doc] add guide about the provided PTX was compiled with an unsupported toolchain (#28305 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-11-08 00:26:34 +08:00
Fadi Arafeh	5fb4137c99	[README] Add Arm CPUs to the list of supported targets (#28290 ) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-11-07 15:41:47 +00:00
Nicolò Lucchesi	68a72a5cc1	Revert "[PerfFix] Avoid separate thread for MP executor shm spin (#28012 )" (#28289 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-11-07 15:07:01 +00:00
Boyuan Feng	0f872b7977	[Log] update shm wait time msg (#28255 )	2025-11-07 09:43:30 -05:00
Wentao Ye	4b1ff13221	[Feature] Default `ignore_eos` True for `random` dataset (#28227 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-07 07:35:33 -05:00
Iceber Gu	e0d6b4a867	[CLI] add --max-tokens to `vllm complete` (#28109 ) Signed-off-by: Iceber Gu <caiwei95@hotmail.com>	2025-11-07 12:21:40 +00:00
Pavani Majety	72b1c2ae2c	[Bugfix] Use latency MOE backend as default for Flashinfer and other misc fixes (#27439 ) Signed-off-by: Pavani Majety <pmajety@nvidia.com>	2025-11-07 04:18:39 -08:00
Lukas Geiger	e0919f331d	[Core][MM] Add mechanism to configure multimodal fields which should stay on CPU (#28168 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-07 12:14:29 +00:00
Kevin H. Luu	8e19d470af	[fix] Revert "fixing mm placeholder replacement issue with gemma3" (#28285 ) Signed-off-by: Kevin H. Luu <khluu000@gmail.com>	2025-11-07 12:09:09 +00:00
Mengqing Cao	1958bda9b4	[Misc][Model][Refactor] Pass the prefix into Linear layers (#28259 ) Signed-off-by: MengqingCao <cmq0113@163.com>	2025-11-07 19:38:38 +08:00
Zhang Xiangze	7bdb42b2f2	[CPU]Avoid repeated random sample compile (#28260 ) Signed-off-by: Zhang Xiangze <Xiangze.Zhang@arm.com>	2025-11-07 11:03:57 +00:00
汪志鹏	315068eb4a	[FixBug]Aeala/ShareGPT_Vicuna_unfiltered marked as multimodal benchmark (#28265 ) Signed-off-by: princepride <wangzhipeng628@gmail.com>	2025-11-07 09:35:22 +00:00
Jialin Ouyang	ccd98b59c1	[Perf] Introduce FlattenLogprobs to store logprobs results to reduce GC overhead (#28171 ) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>	2025-11-07 00:27:12 -08:00
Jee Jee Li	21b82f4ea2	[Kernel] LoRA triton kernels support PDL (#27402 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-11-07 08:05:48 +00:00
Copilot	a736e5ff77	[CI] Reduce Blackwell Fusion test runtime by filtering tests and only run all tests in nightly (#28074 )	2025-11-07 15:58:16 +08:00
baonudesifeizhai	9da9208b20	[Bug] Fix missing token_ids for reasoning parser models in chat completions #28246 (#28256 )	2025-11-07 07:31:58 +00:00
smit kadvani	11fd69dd54	[amd][gptoss] Perf gain because of block alignment (#28024 ) Signed-off-by: Smit Kadvani <smit.kadvani@gmail.com> Co-authored-by: Smit Shaileshbhai Kadvani <kadvani@meta.com>	2025-11-07 05:27:42 +00:00
Harry Mellor	c0a4b95d64	Fix issues from #28242 (#28257 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-07 04:23:17 +00:00
Alexis MacAskill	a47d94f18c	Add runai model streamer e2e test for GCS (#28079 ) Signed-off-by: Alexis MacAskill <amacaskill@google.com>	2025-11-07 03:07:54 +00:00
Alex Brooks	e70fbc599b	[CI/Build] Loosen STT LoRA Translate Check (Flaky Test) (#28247 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Signed-off-by: Alex Brooks <alex.brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-11-07 02:51:27 +00:00
Lucas Kabela	4bf56c79cc	[Multimodal][torch.compile] Add compilation config field for turning off ViT/MM compile (#28242 ) Signed-off-by: Lucas Kabela <lucaskabela@meta.com>	2025-11-07 00:16:03 +00:00
Junhong Liu	59b453eaa2	Speed up mm processor kwargs per request by spliting dynamic and static kwargs (#26483 ) Signed-off-by: Junhong <liujunhong11@huawei.com> Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com> Co-authored-by: Junhong <liujunhong11@huawei.com>	2025-11-07 07:51:28 +08:00
Eugene Khvedchenya	827e4237bc	Fix failing test for CRadio (#27738 ) Signed-off-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: wang.yuqi <noooop@126.com>	2025-11-06 15:32:25 -08:00
Varun Sundar Rabindranath	ca6f755d24	[BugFix] Fix FusedMoELoRA + ModularKernel Integration (#28237 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-11-06 22:53:30 +00:00
Matthew Bonanni	ca90f50304	[Test] Add non-MoE DP test coverage (#28235 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-11-06 20:59:57 +00:00
Fang Han	da855b42d2	[Doc]: Make extraInit containers fully configurable in helm chart (#27497 ) Signed-off-by: Fang Han <fhan0520@gmail.com>	2025-11-06 20:27:16 +00:00
Aleksandr Malyshev	449de9001a	[ROCm] triton fp8 kernel (#27058 ) Signed-off-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>	2025-11-06 14:46:44 -05:00
Vico Chu	d4aa65c998	[Chore] eliminate duplicated and unconditional object serialization in anthropic messages api (#27792 ) Signed-off-by: Vico Chu <vico24826@gmail.com>	2025-11-06 19:09:19 +00:00
Julien Denize	7a8375f8a0	Add llama 4 scaling support (#28145 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai>	2025-11-06 18:55:17 +00:00
Andy Lo	5e0c1fe69c	[Structured outputs] Upgrade llguidance to 1.3.0 (#28039 ) Signed-off-by: Andy Lo <andy@mistral.ai> Co-authored-by: Russell Bryant <rbryant@redhat.com>	2025-11-06 10:24:47 -08:00
Russell Bryant	4507a6dae4	CODEOWNERS: Add myself as reviewer on security docs (#28216 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-11-06 17:39:42 +00:00
Roy Wang	d1dd5f53e4	[Frontend] Fix logging format when enable response logging (#28049 ) Signed-off-by: esmeetu <jasonailu87@gmail.com>	2025-11-06 16:25:39 +00:00
StanHatko	e52e4da971	[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores (#27953 ) Signed-off-by: Stan Hatko <stan_hatko@live.com> Co-authored-by: Li, Jiang <jiang1.li@intel.com>	2025-11-06 23:47:11 +08:00
Milos Puzovic	2176778cd3	[Doc] Add Arm CPUs are on the list of supported targets in vLLM (#26018 ) Signed-off-by: Milos Puzovic <milos.puzovic@arm.com>	2025-11-06 15:30:26 +00:00
Eric Yue	0370679ce9	[Kernel][Model] Tune fused_moe Triton configs for MiniMax-M2 on H100 (#28200 ) Signed-off-by: minatoaquaMK2 <jiacheng.yue@foxmail.com>	2025-11-06 07:29:46 -08:00
Harry Mellor	8816e375d3	[Docs] Switch to directory style URLs (#28058 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-06 07:06:33 -08:00
Michael Goin	f32229293e	Disable nm-testing models with issues in CI (#28206 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-11-06 06:19:07 -08:00
xiangze-arm	c757a15f0f	[CPU]Improve cpu fused moe perf (#27244 ) Signed-off-by: Zhang Xiangze <Xiangze.Zhang@arm.com>	2025-11-06 11:04:18 +00:00
Chauncey	59a50afa08	[Frontend] OpenAI Responses API supports Tool/Function calling - non-harmony (#26874 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-11-06 10:40:03 +00:00
courage17340	981cadb35c	[Bugfix][Kernel] fix merge attn states when both prefix and suffix are empty (#28181 ) Signed-off-by: courage17340 <courage17340@163.com>	2025-11-06 17:52:13 +08:00
wangxiyuan	c3ee80a01a	[V0 deprecation]clean up is_v1_supported_oracle (#28116 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-06 16:05:32 +08:00
Aditya Tewari	3755c14532	[CPU] Enable torch profiling (#28130 ) Signed-off-by: Aditya Tewari <aditya.tewari@arm.com>	2025-11-06 07:32:05 +00:00
Seungduk Kim	201dc98acc	Fix hard-coded parameter name in gemma3n.py (#27946 ) Signed-off-by: Seungduk Kim <seungduk.kim@yanolja.com> Signed-off-by: Biswa Panda <biswa.panda@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Biswa Panda <biswa.panda@gmail.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>	2025-11-05 23:07:36 -08:00

11076 Commits