xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-11 06:25:51 +08:00

Author	SHA1	Message	Date
Jee Jee Li	463074fac8	Merge branch 'main' into mlm-full-lora-support	2025-12-20 08:25:41 +08:00
Zhonghua Deng	969bbc7c61	[Model] Add MiMo-V2-Flash support (#30836 ) Signed-off-by: Abatom <abzhonghua@gmail.com> Signed-off-by: Jumiar <liuanqim10@126.com> Signed-off-by: Zyann7 <zyann7@outlook.com> Co-authored-by: Jumiar <liuanqim10@126.com> Co-authored-by: Zyann7 <zyann7@outlook.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-19 17:17:03 +00:00
Elizabeth Thomas	41b6f9200f	Remove all2all backend envvar (#30363 ) Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-18 19:46:28 +00:00
Lucas Wilkinson	30bb19a760	[BugFix] Partial revert of #29558 (DeepEP HT + PIECEWISE CG support) (#30910 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-17 23:50:15 -08:00
Zhengxu Chen	5f2f3fba1d	[compile] Fix CI for test_gpt2_cache_hit (#30902 ) Signed-off-by: zhxchen17 <zhxchen17@fb.com>	2025-12-17 20:22:23 -08:00
SungMinCho	a0b782f9cc	[Metrics] Model FLOPs Utilization estimation (#30738 ) Signed-off-by: SungMinCho <tjdals4565@gmail.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Mark McLoughlin <markmc@redhat.com>	2025-12-18 01:40:51 +00:00
Jee Jee Li	94dce5c3d9	Merge branch 'main' into mlm-full-lora-support	2025-12-17 00:33:42 +08:00
Boyuan Feng	104003dc77	update piecewise cudagraph warning when splitting_ops=[] (#30728 ) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-12-16 06:09:34 -08:00
B-201	bdac2b5d17	Merge branch 'main' into mlm-full-lora-support	2025-12-16 19:13:22 +08:00
jiangkuaixue123	b9ff4f2a8d	[feature] extend DBO to XBO (#30120 ) Signed-off-by: jiangkuaixue123 <jiangxiaozhou111@163.com> Co-authored-by: root <root@hk01dgx028.cm.cluster>	2025-12-16 00:04:01 -05:00
Michael Goin	a450c64a30	[Bugfix] Fail instead of ignoring when CompilationConfig gets invalid args (#30708 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-15 20:18:02 +00:00
Harry Mellor	970713d4a4	Remove `SkipValidation` from `ModelConfig` (#30695 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-15 17:34:08 +00:00
Nicolò Lucchesi	185c22bf2f	[Misc][Hybrid allocator + kv connector] Optionally enable hybrid allocator + KV cache connector (#29805 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-12-15 11:17:58 +00:00
wang.yuqi	4429d934de	[Model] Automatic conversion of TokenClassification model (#30666 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-15 08:13:00 +00:00
Boyuan Feng	917fdae5b2	[Log] Skip piecewise cudagraph warn when using full cudagraph (#30657 ) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-12-15 02:49:45 +00:00
yifant-code	5ccf0efa84	[Bugfix] Improve error messages in ModelConfig validation (#30213 ) Signed-off-by: ytian218 <ytian218@bloomberg.net> Co-authored-by: ytian218 <ytian218@bloomberg.net>	2025-12-14 21:23:37 +08:00
Jee Jee Li	35acd22a5d	Move forward Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-12 08:53:09 +00:00
Jee Jee Li	421707dec1	Merge branch 'main' into mlm-full-lora-support	2025-12-12 15:00:59 +08:00
Jee Jee Li	208dc0c954	Fix comments Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-12 00:05:07 +00:00
Nicolò Lucchesi	0efd9f867c	[Core] Whisper Enable Encoder Batching (#29421 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-12-11 21:06:51 +00:00
Harry Mellor	cf3eacfe58	Standardise `get_rope` to use `rope_parameters["partial_rotary_factor"]`, not `rotary_dim` (#30389 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-11 20:45:23 +00:00
B-201	e10321bf6a	Merge branch 'main' into mlm-full-lora-support	2025-12-12 00:04:59 +08:00
bk-201	dd857e480f	Merge branch 'mlm-full-lora-support' of https://github.com/jeejeelee/vllm into mlm-full-lora-support	2025-12-11 16:02:37 +00:00
Qiu	a11f4a81e0	[Misc][PCP&DCP] relocate PCP feature check (#30050 ) Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-12-11 03:36:18 -08:00
wang.yuqi	a5f9fb5960	[Deprecation] Deprecation `--convert reward`, use `--convert embed` instead. (#30463 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-11 10:18:25 +00:00
bk-201	27448490f1	update argument name Signed-off-by: bk-201 <joy25810@foxmail.com>	2025-12-11 06:46:53 +00:00
Cyrus Leung	7e24e5d4d6	[Deprecation] Remove deprecated task, seed and MM settings (#30397 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:59:39 -08:00
Cyrus Leung	5a87d8b9b1	[Deprecation] Remove deprecated plugin and compilation fields for v0.13 release (#30396 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-10 19:59:35 -08:00
B-201	d1307e1d29	Merge branch 'main' into mlm-full-lora-support	2025-12-11 11:47:50 +08:00
Will Eaton	a9e4106f28	[P/D] KV Load Failure Recovery/Abort Configuration (#26813 ) Signed-off-by: Will Eaton <weaton@redhat.com> Signed-off-by: Will Eaton <me@wseaton.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: chaunceyjiang <chaunceyjiang@gmail.com>	2025-12-10 11:00:52 -08:00
Nicolò Lucchesi	c756fb6781	[Core] Whisper enable `FULL_DECODE_ONLY` CudaGraph (#30072 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-12-10 06:14:24 -08:00
bk-201	5ff0c6fb73	Merge remote-tracking branch 'origin/main' into mlm-full-lora-support	2025-12-10 07:10:58 +00:00
PatrykSaffer	4c2e10ea19	[Bugfix] Fix cuda graph sizes when running with speculative decoding (#30330 ) Signed-off-by: Patryk Saffer <patryk.saffer99@gmail.com> Signed-off-by: PatrykSaffer <patryk.saffer@mistral.ai> Co-authored-by: Patryk Saffer <patryk.saffer99@gmail.com>	2025-12-10 00:47:07 +00:00
Benjamin Chislett	e858bfe051	[Cleanup] Refactor profiling env vars into a CLI config (#29912 ) Signed-off-by: Benjamin Chislett <bchislett@nvidia.com> Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-09 13:29:33 -05:00
Laith Sakka	87aee9ed2b	Add evaluate_guards option to DynamicShapesConfig (#27432 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-12-08 10:46:15 -05:00
wang.yuqi	9e77ffca3f	[Model][7/N] Improve all pooling task | Deprecation as_reward_model. Extract hidden states prefer using new multi-vector retrieval API (#26686 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-08 08:10:09 +00:00
Isotr0py	b952f4d3c3	[v1] Add PrefixLM support to FlexAttention backend (#27938 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-12-07 15:51:36 +00:00
Cyrus Leung	e83b7e379c	Revert "[Renderer] Separate out `RendererConfig` from `ModelConfig` (#30145 )" (#30199 )	2025-12-07 00:00:22 -08:00
Cyrus Leung	27f4c2fd46	[Renderer] Separate out `RendererConfig` from `ModelConfig` (#30145 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-06 23:15:42 -08:00
Wentao Ye	17eb25e327	[Perf] Enable cuda graph for deepepHT, 5.3% throughput improvement, 4.4% TTFT improvement (#29558 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-12-07 04:44:50 +00:00
Nick Hill	4026ae31e9	[Misc] Move `disable_nccl_for_dp_synchronization` init logic into `VllmConfig` (#30161 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-12-05 20:59:04 -08:00
Rohan Potdar	40a046cd82	[Bugfix]: Fix `TokenizerLike` interface (#30009 ) Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>	2025-12-05 20:56:40 -08:00
Harry Mellor	bf4a901af9	Better error when world size is larger than node and `distributed_executor_backend` is not set (#30140 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-05 20:53:52 -08:00
Bangsheng Tang	77e4472809	let draft model follow target model's config_format (#30152 )	2025-12-05 13:33:42 -08:00
Ilya Markov	4e26d3b09e	[Compile] Conditional compilation. Introduce compile_ranges (#24252 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Signed-off-by: Luka Govedič <luka.govedic@gmail.com> Signed-off-by: ProExpertProg <lgovedic@redhat.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by: Luka Govedič <luka.govedic@gmail.com>	2025-12-05 18:17:32 +00:00
Matthew Bonanni	66e674cdd5	[Attention][UX][1/N] Add AttentionConfig and change attention env vars to CLI arguments (#26315 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>	2025-12-05 09:48:43 -08:00
Alec S	2c174420f5	Reduce validation to a warning (#28749 ) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-05 14:02:49 +00:00
B-201	1fbd7287b8	Merge branch 'main' into mlm-full-lora-support	2025-12-05 20:17:40 +08:00
bk-201	113eb2e0b8	add a enable option Signed-off-by: bk-201 <joy25810@foxmail.com>	2025-12-05 12:14:53 +00:00
Max Hu	c2894d3883	[Feature] Add Layer-wise NVTX Support (#29990 ) Signed-off-by: Max Hu <hyoung2991@gmail.com> Signed-off-by: Max Hu <maxhu@nvidia.com> Co-authored-by: Max Hu <maxhu@nvidia.com>	2025-12-05 11:20:07 +00:00

379 Commits