xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-07 18:37:09 +08:00

Author	SHA1	Message	Date
Copilot	a736e5ff77	[CI] Reduce Blackwell Fusion test runtime by filtering tests and only run all tests in nightly (#28074 )	2025-11-07 15:58:16 +08:00
baonudesifeizhai	9da9208b20	[Bug] Fix missing token_ids for reasoning parser models in chat completions #28246 (#28256 )	2025-11-07 07:31:58 +00:00
smit kadvani	11fd69dd54	[amd][gptoss] Perf gain because of block alignment (#28024 ) Signed-off-by: Smit Kadvani <smit.kadvani@gmail.com> Co-authored-by: Smit Shaileshbhai Kadvani <kadvani@meta.com>	2025-11-07 05:27:42 +00:00
Harry Mellor	c0a4b95d64	Fix issues from #28242 (#28257 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-07 04:23:17 +00:00
Alexis MacAskill	a47d94f18c	Add runai model streamer e2e test for GCS (#28079 ) Signed-off-by: Alexis MacAskill <amacaskill@google.com>	2025-11-07 03:07:54 +00:00
Alex Brooks	e70fbc599b	[CI/Build] Loosen STT LoRA Translate Check (Flaky Test) (#28247 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Signed-off-by: Alex Brooks <alex.brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-11-07 02:51:27 +00:00
Lucas Kabela	4bf56c79cc	[Multimodal][torch.compile] Add compilation config field for turning off ViT/MM compile (#28242 ) Signed-off-by: Lucas Kabela <lucaskabela@meta.com>	2025-11-07 00:16:03 +00:00
Junhong Liu	59b453eaa2	Speed up mm processor kwargs per request by splitting dynamic and static kwargs (#26483 ) Signed-off-by: Junhong <liujunhong11@huawei.com> Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com> Co-authored-by: Junhong <liujunhong11@huawei.com>	2025-11-07 07:51:28 +08:00
Eugene Khvedchenya	827e4237bc	Fix failing test for CRadio (#27738 ) Signed-off-by: Eugene Khvedchenia <ekhvedchenia@nvidia.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: wang.yuqi <noooop@126.com>	2025-11-06 15:32:25 -08:00
Varun Sundar Rabindranath	ca6f755d24	[BugFix] Fix FusedMoELoRA + ModularKernel Integration (#28237 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-11-06 22:53:30 +00:00
Matthew Bonanni	ca90f50304	[Test] Add non-MoE DP test coverage (#28235 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-11-06 20:59:57 +00:00
Fang Han	da855b42d2	[Doc]: Make extraInit containers fully configurable in helm chart (#27497 ) Signed-off-by: Fang Han <fhan0520@gmail.com>	2025-11-06 20:27:16 +00:00
Aleksandr Malyshev	449de9001a	[ROCm] triton fp8 kernel (#27058 ) Signed-off-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Aleksandr Malyshev <maleksan@amd.com> Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>	2025-11-06 14:46:44 -05:00
Vico Chu	d4aa65c998	[Chore] eliminate duplicated and unconditional object serialization in anthropic messages api (#27792 ) Signed-off-by: Vico Chu <vico24826@gmail.com>	2025-11-06 19:09:19 +00:00
Julien Denize	7a8375f8a0	Add llama 4 scaling support (#28145 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai>	2025-11-06 18:55:17 +00:00
Andy Lo	5e0c1fe69c	[Structured outputs] Upgrade llguidance to 1.3.0 (#28039 ) Signed-off-by: Andy Lo <andy@mistral.ai> Co-authored-by: Russell Bryant <rbryant@redhat.com>	2025-11-06 10:24:47 -08:00
Russell Bryant	4507a6dae4	CODEOWNERS: Add myself as reviewer on security docs (#28216 ) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-11-06 17:39:42 +00:00
Roy Wang	d1dd5f53e4	[Frontend] Fix logging format when enable response logging (#28049 ) Signed-off-by: esmeetu <jasonailu87@gmail.com>	2025-11-06 16:25:39 +00:00
StanHatko	e52e4da971	[HARDWARE][CPU] Add Option for Disabling Binding to Specific CPU Cores (#27953 ) Signed-off-by: Stan Hatko <stan_hatko@live.com> Co-authored-by: Li, Jiang <jiang1.li@intel.com>	2025-11-06 23:47:11 +08:00
Milos Puzovic	2176778cd3	[Doc] Add Arm CPUs are on the list of supported targets in vLLM (#26018 ) Signed-off-by: Milos Puzovic <milos.puzovic@arm.com>	2025-11-06 15:30:26 +00:00
Eric Yue	0370679ce9	[Kernel][Model] Tune fused_moe Triton configs for MiniMax-M2 on H100 (#28200 ) Signed-off-by: minatoaquaMK2 <jiacheng.yue@foxmail.com>	2025-11-06 07:29:46 -08:00
Harry Mellor	8816e375d3	[Docs] Switch to directory style URLs (#28058 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-06 07:06:33 -08:00
Michael Goin	f32229293e	Disable nm-testing models with issues in CI (#28206 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-11-06 06:19:07 -08:00
xiangze-arm	c757a15f0f	[CPU]Improve cpu fused moe perf (#27244 ) Signed-off-by: Zhang Xiangze <Xiangze.Zhang@arm.com>	2025-11-06 11:04:18 +00:00
Chauncey	59a50afa08	[Frontend] OpenAI Responses API supports Tool/Function calling - non-harmony (#26874 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-11-06 10:40:03 +00:00
courage17340	981cadb35c	[Bugfix][Kernel] fix merge attn states when both prefix and suffix are empty (#28181 ) Signed-off-by: courage17340 <courage17340@163.com>	2025-11-06 17:52:13 +08:00
wangxiyuan	c3ee80a01a	[V0 deprecation]clean up is_v1_supported_oracle (#28116 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-06 16:05:32 +08:00
Aditya Tewari	3755c14532	[CPU] Enable torch profiling (#28130 ) Signed-off-by: Aditya Tewari <aditya.tewari@arm.com>	2025-11-06 07:32:05 +00:00
Seungduk Kim	201dc98acc	Fix hard-coded parameter name in gemma3n.py (#27946 ) Signed-off-by: Seungduk Kim <seungduk.kim@yanolja.com> Signed-off-by: Biswa Panda <biswa.panda@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Biswa Panda <biswa.panda@gmail.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>	2025-11-05 23:07:36 -08:00
Julien Denize	a404e2c0f1	Patch Mistral Tokenizer (#28146 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai>	2025-11-06 06:43:16 +00:00
Xiaozhu Meng	e31946f86e	[flashinfer] fix FI all2all with FI cutlass moe (#28166 ) Signed-off-by: Xiaozhu <mxz297@gmail.com>	2025-11-06 05:52:16 +00:00
gmagogsfm	bde5039325	[CI] Add compile/test_multimodal_compile.py to CI (#28151 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-06 05:41:47 +00:00
Jacob Zhong	d72299d47b	Make the cv2 dependency optional (#27780 ) Signed-off-by: Jacob <cmpute@qq.com>	2025-11-06 05:08:55 +00:00
Lukas Geiger	80679f108f	[Core][MM] Use non-blocking CPU-GPU copy of multimodal data (#28141 ) Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>	2025-11-06 04:05:12 +00:00
Isotr0py	43ecd0a900	[Chore] Clean up deepseek v2/v3 config copy (#28055 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-06 03:46:30 +00:00
Chauncey	07d614511f	[Misc] Remove the duplicate code (#28111 ) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-11-05 21:07:47 -05:00
Vadim Gimpelson	f948ab6945	[CI Failure] `nm-testing/Qwen2-0.5B-Instruct-FP8-SkipQKV` was removed from HF. Skip it in tests (#28170 ) Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>	2025-11-06 01:22:13 +00:00
Wentao Ye	d71af5f502	[Feature] Enable TP + EP `shared_experts` overlap with router, 3.7% E2E performance improvement (#28164 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-05 17:21:08 -08:00
Wentao Ye	90189c71a9	[Bug] Fix env string `"0"` same to `True` (#28159 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-05 17:04:20 -08:00
Wentao Ye	d79d9f0780	[Bug] Fix cpu disable shared_experts `VLLM_DISABLE_SHARED_EXPERTS_STREAM` (#28157 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-05 17:03:09 -08:00
Vadim Gimpelson	b6a248bdd7	[PERF] Decouple projections from GDN custom op. Attempt 2 (#28083 ) Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>	2025-11-05 17:01:12 -08:00
Dayeol Lee	1767658559	[Debugging] Add annotation for easier trace analysis (#22496 )	2025-11-05 16:52:52 -08:00
Kuntai Du	efe73e9b57	[Core][Hybrid allocator + connector 2/n] Unify `remove_skipped_blocks` by `get_last_useful_token` (#25431 ) Signed-off-by: KuntaiDu <kuntai@uchicago.edu>	2025-11-06 00:12:00 +00:00
Zhewen Li	0b8e871e5e	[CI/Build] Fix `test_defaults_with_usage_context` in AMD CI (#27926 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-05 15:40:24 -08:00
Zhewen Li	5ee93a5956	[CI/Build] Update checking logic in cutlass_group_gemm_supported (#27948 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-05 15:40:10 -08:00
Snehlata	e15601789b	[Feature]: Add corrupted request metric to V1 metrics system. (#27306 ) Signed-off-by: atalhens <sneh.lata@nutanix.com>	2025-11-05 13:45:29 -08:00
Richard Zou	65ac8d8dc4	[Docs] Add guide to debugging vLLM-torch.compile integration (#28094 ) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-11-05 21:31:46 +00:00
Isotr0py	ffb08379d8	[Chore] Remove Nemotron-Nano-VL config copy (#28126 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-05 20:06:45 +00:00
R3hankhan	e04492449e	[Hardware][IBM Z] Optimize s390x Dockerfile (#28023 ) Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>	2025-11-05 11:25:44 -08:00
Michael Yao	518ec6b722	[Docs] Clean up README_TUNING.md (#28088 ) Signed-off-by: windsonsea <haifeng.yao@daocloud.io>	2025-11-05 19:01:34 +00:00


11055 Commits