xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-14 07:17:11 +08:00

Author	SHA1	Message	Date
gh-wf	36c9ce2554	Ensure minimum frames for GLM 4.6V compatibility (#30285 ) Signed-off-by: Wayne Ferguson <wayneferguson@gmail.com>	2025-12-11 05:26:49 +00:00
xyDong0223	1a516557e1	[Doc] Add Baidu Kunlun XPU support (#30455 ) Signed-off-by: xyDong0223 <dongxinyu23@gmail.com>	2025-12-11 04:52:17 +00:00
Wentao Ye	d6464f2679	[Chore] Fix torch precision warning (#30428 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-12-11 04:05:56 +00:00
Cyrus Leung	7e24e5d4d6	[Deprecation] Remove deprecated task, seed and MM settings (#30397 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:59:39 -08:00
Cyrus Leung	5a87d8b9b1	[Deprecation] Remove deprecated plugin and compilation fields for v0.13 release (#30396 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:59:35 -08:00
Divakar Verma	d1e1fb4363	[Bugfix] Fix grouped_topk pytorch impl when num_experts can't be grouped properly (#29439 ) Signed-off-by: Divakar Verma <divakar.verma@amd.com> Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com> Co-authored-by: TJian <tunjian.tan@embeddedllm.com>	2025-12-10 19:47:18 -08:00
Andreas Karatzas	b51255f369	[ROCm] Fix broken import in platform attention backend dispatching (#30432 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-11 01:12:58 +00:00
Sage Moore	b4054c8ab4	Revert "[CI] Add Async Eplb nightly CI tests (#29385 )" (#30431 )	2025-12-11 00:48:35 +00:00
Xu Song	25221b44bb	Add more docs for regex (#30106 ) Signed-off-by: Xu Song <xusong.vip@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-11 00:12:21 +00:00
shivampr	8580919ac3	[Bugfix] fix confusing OOM errors during v1 init (#28051 ) Signed-off-by: Shivam <shivamprasad91@gmail.com> Signed-off-by: shivampr <shivampr.dev@gmail.com> Co-authored-by: Chen Zhang <zhangch99@outlook.com>	2025-12-10 23:17:41 +00:00
Christina Norman	166ac3c94d	fix(shm): Add memory barriers for cross-process shared memory visibility (#30407 ) Signed-off-by: Christina Holland <hey@christinaholland.com> Signed-off-by: Christina <truffle@gmail.com>	2025-12-10 23:01:19 +00:00
Seiji Eicher	b9e0951f96	[docs] Improve wide-EP performance + benchmarking documentation (#27933 ) Signed-off-by: Seiji Eicher <seiji@anyscale.com>	2025-12-10 22:15:54 +00:00
Michael Goin	fcb894222f	[Docs] Update EPLB docs (#30426 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-10 11:56:51 -09:00
Nick Hill	6ccb7baeb1	[LMCache] Fix breakage due to new LMCache version (#30216 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-12-10 11:52:01 -08:00
Po-Han Huang (NVIDIA)	eea41804a4	[bug] Fix "Current vLLM config is not set." warnings when FlashInfer attention is used (#30241 ) Signed-off-by: Po-Han Huang <pohanh@nvidia.com>	2025-12-10 11:18:51 -08:00
Jialin Ouyang	9f042ba26b	[Perf] Enable environment cache in EngineCore to enable the feature for UniProcExecutor as well (#29289 ) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>	2025-12-10 14:13:01 -05:00
Cyrus Leung	e72d65b959	[Deprecation] Remove tokenizer setter (#30400 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:10:58 +00:00
Will Eaton	a9e4106f28	[P/D] KV Load Failure Recovery/Abort Configuration (#26813 ) Signed-off-by: Will Eaton <weaton@redhat.com> Signed-off-by: Will Eaton <me@wseaton.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-10 11:00:52 -08:00
Anker	e8e8cd73e5	[Bugfix] Fix HunyuanOCR cross-image contamination in batch processing (#30344 ) Signed-off-by: Lennart Brog <lennart.borg@list-ag.de> Signed-off-by: Anker <20343812+anker-c2@users.noreply.github.com>	2025-12-10 18:09:31 +00:00
Cyrus Leung	253305d5b2	[Chore] Delay recent deprecations (#30398 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 17:48:38 +00:00
Matthew Bonanni	794a7875ee	[Misc] Consistent case for `vllm bench serve` results (#30403 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-12-10 09:44:02 -08:00
Mark McLoughlin	2dcbac9077	[Docs] Generate full list of metrics in user docs (#30388 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-10 16:09:34 +00:00
Lucas Wilkinson	aacf0abf8b	[BugFix] Fix `AttributeError: 'MergedColumnParallelLinear' object has no attribute 'weight_scale'` (#30399 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-12-10 07:59:23 -08:00
Nicolò Lucchesi	c756fb6781	[Core] Whisper enable `FULL_DECODE_ONLY` CudaGraph (#30072 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-12-10 06:14:24 -08:00
Roger Young	d017bceb08	[BugFix] Fix minimax m2 model rotary_dim (#30384 ) Signed-off-by: xuebi <xuebi@minimaxi.com> Co-authored-by: xuebi <xuebi@minimaxi.com>	2025-12-10 04:58:50 -08:00
Aditya Tewari	cebda2a4af	[CPU] Support for Whisper (#30062 ) Signed-off-by: Aditya Tewari <aditya.tewari@arm.com>	2025-12-10 04:58:42 -08:00
Daniele	53d2420b44	[Bugfix] tpu_model_runner: set vllm config context when calling reset_dynamo_cache() (#30331 ) Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>	2025-12-10 04:58:35 -08:00
Chauncey	9db78f34dc	[Bugfix] Fix the issue where DeepSeek v3.2 cannot use structured_output (#30371 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-10 08:30:16 +00:00
Fadi Arafeh	434ac76a7c	[cpu][ci] Add CPU Attention Tests for Neon Backend (#30347 ) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-12-10 05:37:35 +00:00
Andreas Karatzas	ed7af3178a	[ROCm][CI] Attempt to fix the failures under a subgroup of the e2e test group (#29358 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com> Signed-off-by: Micah Williamson <micah.williamson@amd.com> Co-authored-by: Micah Williamson <micah.williamson@amd.com>	2025-12-10 05:33:13 +00:00
Radu Salavat	180345807f	[CMake][Build]: Remove unused ACL CMake env variables (#30339 ) Signed-off-by: Radu Salavat <radu.salavat@arm.com>	2025-12-10 04:27:19 +00:00
Mingliang Li	d007387aa7	[Bugfix] Cache added_vocab to avoid per-token overhead (#30351 ) Signed-off-by: limingliang <limingliang@stepfun.com> Co-authored-by: limingliang <limingliang@stepfun.com>	2025-12-10 12:05:51 +08:00
Wilson Wu	3bdd426636	Fix typos in comments across multiple files (#30345 ) Signed-off-by: Wilson Wu <iwilsonwu@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-12-09 20:05:28 -08:00
haoyangli-amd	06462392e4	[bugfix][quantization] fix quark qwen3 kv_cache quantization (#30308 ) Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>	2025-12-10 03:24:12 +00:00
Micah Williamson	7d80c73d42	[CI] Reduce Flakiness For test_spec_decode.py::test_suffix_decoding_acceptance (#30367 ) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-12-10 02:35:49 +00:00
rasmith	b75f826fca	[CI/Build][AMD] Skip quantization kernels tests that require CUTLASS or e4m3fn when not supported by platform (#30020 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-12-10 02:28:37 +00:00
Andrew Xia	c3487aca34	[responsesAPI][6] Fix multi turn MCP tokenization (#30230 ) Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>	2025-12-10 10:13:13 +08:00
Lucas Wilkinson	abe93bce59	[Attention] Make seq_lens_cpu optional in CommonAttentionMetadata to enable true async spec-decode (#29624 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com>	2025-12-09 17:18:10 -08:00
ElizaWszola	2e7035dd8c	[Bugfix] Fix fp8 DeepGemm compilation issues (#30336 )	2025-12-09 20:17:25 -05:00
PatrykSaffer	4c2e10ea19	[Bugfix] Fix cuda graph sizes when running with speculative decoding (#30330 ) Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com> Signed-off-by: PatrykSaffer <patryk.saffer@mistral.ai> Co-authored-by: Patryk Saffer <patryk.saffer99@gmail.com>	2025-12-10 00:47:07 +00:00
dongbo910220	03b5f940fd	[V1][Spec Decode] Optimize Medusa proposer to avoid GPU-CPU sync (#29723 ) Signed-off-by: dongbo910220 <1275604947@qq.com>	2025-12-10 00:15:01 +00:00
Hashem Hashemi	2e7054da06	Improve wvsplitK tile and balance heuristics. (#29937 ) Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>	2025-12-09 23:51:32 +00:00
Charlie Fu	3c680f4a17	[Rocm][torch.compile] Adding layernorm + fp8 block quant and silu + fp8 block quant for Aiter (#25693 ) Signed-off-by: charlifu <charlifu@amd.com> Signed-off-by: Micah Williamson <micah.williamson@amd.com> Signed-off-by: Charlie Fu <Charlie.Fu@amd.com> Co-authored-by: Micah Williamson <micah.williamson@amd.com> Co-authored-by: wuhuikx <hattie.wu@amd.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>	2025-12-09 22:39:26 +00:00
Kyle Sayers	fccd532587	[Quantization] FP8 Weight Reloading for Quantized RL Rollout (#28480 ) Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>	2025-12-09 13:54:32 -08:00
bnellnm	00e5cbb967	[MoE][Refactor] Remove most arguments to FusedMoEMethodBase.apply (#29066 ) Signed-off-by: Bill Nell <bnell@redhat.com> Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>	2025-12-09 13:48:25 -08:00
rasmith	7618dc973d	[CI/Build] Make test_mha_attn.py run on correct platform only and check for flash_attn_varlen_func in layer.py (#29145 )	2025-12-09 20:18:17 +00:00
dependabot[bot]	f8dacc66b6	Bump actions/stale from 10.1.0 to 10.1.1 (#30234 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-09 20:12:14 +00:00
dependabot[bot]	7cab92fd45	Bump actions/checkout from 6.0.0 to 6.0.1 (#30233 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-09 20:03:16 +00:00
Tsukasa OI	73a484caa1	[Model][Quantization] Fix / Add GGUF support for Qwen2 MoE models (#30307 ) Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>	2025-12-09 19:13:10 +00:00
Lucas Wilkinson	b37bf51e75	[CI/Test] Fix FP8 per-tensor quant test reference scale shape (#30352 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-12-09 12:52:20 -06:00

12241 Commits