xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-25 01:25:01 +08:00

Author	SHA1	Message	Date
Laith Sakka	1f0d184590	[aot_compile]change VLLM backend to read fake args from example_value (#29104 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-12-04 17:33:45 -05:00
elvischenv	afe9eb408e	[Bugfix] Fix flashinfer ar+norm kernel not available issue (#29960 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-12-03 18:50:53 +00:00
Yong Hoon Shin	69520bc695	Add logging for cudagraph related info (#29825 ) Signed-off-by: Yong Hoon Shin <yhshin@meta.com>	2025-12-03 01:01:48 -08:00
elvischenv	c719c40540	[Bugfix] Defunctionalize TRTLLM AR+Norm op for avoiding extra clone kernel before it (#29631 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-12-03 05:15:50 +00:00
Arpit Khandelwal	d7284a2604	[Core] Rename PassConfig flags as per RFC #27995 (#29646 ) Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-12-03 03:38:55 +00:00
Didier Durand	fae6943068	[Doc]: fixing typos in multiple files. (#29685 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-28 08:41:41 -08:00
Matthew Bonanni	430dd4d9eb	[Attention] Remove imports from `vllm/attention/__init__.py` (#29342 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-11-26 10:53:15 -07:00
George D. Torres	56531b79cc	[Misc] Add backup hash algorithm for FIPS constrained environments (#28795 ) Signed-off-by: George D. Torres <gdavtor@gmail.com> Signed-off-by: George D. Torres <41129492+geodavic@users.noreply.github.com> Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: Russell Bryant <rbryant@redhat.com>	2025-11-26 00:50:22 +00:00
Ilya Markov	e7d776273d	[Compile] Refactor. Move PostGradPassManager out of Compilation config (#29340 ) Signed-off-by: ilmarkov <markovilya197@gmail.com>	2025-11-25 19:58:56 +00:00
Icey	888152bf87	Allow oot custom compiler extension via CompilerInterface (#28623 ) Signed-off-by: wxsIcey <1790571317@qq.com> Signed-off-by: Mengqing Cao <cmq0113@163.com> Signed-off-by: Icey <1790571317@qq.com> Co-authored-by: Mengqing Cao <cmq0113@163.com>	2025-11-25 15:25:15 +08:00
Laith Sakka	7a228b5305	Add option to use unbacked, and backed size obl dynamic shapes for more sounds compilation. (#26199 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-11-24 10:12:41 -05:00
Yanan Cao	933f67ecd8	[Bugfix]Fix a conditional to not check zero value (#28754 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-21 19:59:07 -08:00
zhrrr	a982f5b5ea	[kernel][perf] support uncontiguous input for rms_norm kernel (#28103 ) Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com> Signed-off-by: izhuhaoran <izhuhaoran@qq.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-11-20 19:39:09 -08:00
Driss Guessous	3fd74189db	Fixes bench (#29058 ) Signed-off-by: drisspg <drisspguessous@gmail.com>	2025-11-20 21:21:54 +00:00
vnadathur	1ffe934c8a	[torch.compile] caching of config fields should be opt-out by default (#26468 ) Signed-off-by: vnadathur <glvikramn@gmail.com> Signed-off-by: WorldExplored <srreyansh.sethi@gmail.com> Signed-off-by: Srreyansh Sethi <srreyansh.sethi@gmail.com> Signed-off-by: Srreyansh Sethi <107075589+WorldExplored@users.noreply.github.com> Co-authored-by: WorldExplored <srreyansh.sethi@gmail.com> Co-authored-by: Srreyansh Sethi <107075589+worldexplored@users.noreply.github.com> Co-authored-by: vnadathur <236933696+vnadathur@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-19 06:13:54 -08:00
Kunshang Ji	1b82fb0ad3	[XPU] work around for sp, avoid custom op import error (#28822 ) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-11-17 13:16:44 +00:00
Didier Durand	2bb4435cb7	[Doc]: fix typos in various files (#28567 ) Signed-off-by: Didier Durand <durand.didier@gmail.com>	2025-11-15 19:27:50 +00:00
Angela Yi	f36292dbee	[compile] Enable sequence parallelism matching w/o custom ops enabled (#27126 ) Signed-off-by: angelayi <yiangela7@gmail.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ProExpertProg <lgovedic@redhat.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <luka.govedic@gmail.com>	2025-11-15 11:46:12 +00:00
Laith Sakka	2e0ad629b0	Avoid bytecode hook and simplify TorchCompileWrapperWithCustomDipatch (#25110 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-11-14 14:11:10 -08:00
Yanan Cao	262d263f6c	[Bugfix] Eliminate tuple inputs to submodules in graph partitioning (#28533 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-13 15:09:05 -05:00
Yanan Cao	48c879369f	[Frontend] Change CompilationMode to a proper Enum (#28165 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-11 19:46:18 -05:00
zhrrr	68c09efc37	[Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165 ) Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>	2025-11-11 12:00:31 -05:00
Ilya Markov	d17ecc6b19	[PERF] Allreduce fusion. Support torch native matching. Tuning of the thresholds (#24248 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-11-10 18:33:11 -05:00
Boyuan Feng	b158df2813	remove resolve_op_overloads and use splitting_ops directly (#28081 ) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-11-08 01:13:13 +00:00
gmagogsfm	002b07c4b2	[Bugfix] vLLM should check Inductor config for compile cache enablement status (#27637 ) Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>	2025-11-05 12:22:44 -05:00
Boyuan Feng	6ab183813c	[Graph Partition][Cache] Use inductor partition ops config (#27702 ) Signed-off-by: Boyuan Feng <boyuan@meta.com>	2025-11-05 13:04:48 +00:00
ahao-anyscale	cac4c10ef0	[BUG] Make 'binary' default option for saving torch compile artifacts when using standalone_compile (#27616 ) Signed-off-by: ahao-anyscale <ahao@anyscale.com>	2025-11-03 11:13:51 -05:00
Lucas Kabela	94666612a9	[Misc][qwen2_5_vl][torch.compile] Enable `supports_torch_compile` on generic nn.Module and demonstrate speedup on Qwen Vision model (#23207 ) Signed-off-by: Lucas Kabela <lucaskabela@meta.com> Signed-off-by: Lucas Kabela <lucasakabela@gmail.com>	2025-10-28 22:36:43 +00:00
Zhengxu Chen	e3d8186666	[compile] Add fallback path to AOT compile when serialization fails. (#27350 ) Signed-off-by: zhxchen17 <zhxchen17@fb.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-28 12:54:26 -04:00
Zhengxu Chen	a00d6254e9	[compile] Disable dynamo guards check for AOT compilation. (#27288 ) Signed-off-by: zhxchen17 <zhxchen17@fb.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-28 12:58:12 +00:00
Yeshwanth N	71b1c8b667	[Chore]:Extract math and argparse utilities to separate modules (#27188 ) Signed-off-by: Yeshwanth Surya <yeshsurya@gmail.com> Signed-off-by: Yeshwanth N <yeshsurya@gmail.com> Signed-off-by: yeshsurya <yeshsurya@gmail.com>	2025-10-26 04:03:32 -07:00
Wentao Ye	52efc34ebf	[Log] Optimize Startup Log (#26740 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-10-24 19:27:04 -04:00
dongbo910220	a0003b56b0	[Chore] Separate out system utilities from vllm.utils (#27201 ) Signed-off-by: dongbo910220 <1275604947@qq.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-22 20:25:25 +00:00
Jiangyun Zhu	ab3e80042e	[torch.compile] Enable silu_mul_fp8_quant fusion without custom ops enabled (#27146 ) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>	2025-10-22 00:22:39 -04:00
Isotr0py	6ac5e06f7c	[Chore] Clean up pytorch helper functions in `vllm.utils` (#26908 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: isotr0py <2037008807@qq.com>	2025-10-18 09:48:22 -07:00
Luka Govedič	bd7157a071	[torch.compile] Enable attention and allreduce fusion without custom ops enabled (#24604 ) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-17 08:10:23 -06:00
Harry Mellor	6c9fdbf725	[Docs] Replace `rst` style double-backtick with `md` single-backtick (#27091 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-17 02:47:34 -07:00
Cyrus Leung	4d4d6bad19	[Chore] Separate out `vllm.utils.importlib` (#27022 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-17 00:48:59 +00:00
Lucia Fang	11ae016bd7	[torch.compile] Passing only necessary compilation config to inductor pass config (#27041 ) Signed-off-by: Lu Fang <fanglu@fb.com> Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>	2025-10-17 00:01:52 +00:00
Bram Wasti	7d8975de84	Deepseek-v3 Batch Invariant on 8xH100 (#26609 ) Signed-off-by: Bram Wasti <bwasti@meta.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>	2025-10-15 22:06:02 -07:00
Richard Zou	9b6504c307	[BugFix] Work around graph partition x torch.compile cache issue (#26956 ) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-10-15 20:06:11 -07:00
Morrison Turnansky	96b9aa5aa0	[Frontend][torch.compile] CompilationConfig Overhaul (#20283 ): name change compilation level to compilation mode, deprecation compilation level (#26355 ) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-15 02:51:16 +00:00
Luka Govedič	2dcd12d357	[torch.compile] Fix tests for torch==2.9 inductor partition (#26116 ) Signed-off-by: ProExpertProg <lgovedic@redhat.com> Signed-off-by: Luka Govedič <lgovedic@redhat.com>	2025-10-14 19:55:02 -04:00
Angela Yi	b59dd19b55	[compile] Enable sequence parallelism for full cuda graph without specifying compile sizes (#26681 ) Signed-off-by: angelayi <yiangela7@gmail.com>	2025-10-13 18:15:34 -07:00
Morrison Turnansky	e3fdb627d9	[FrontEnd] UNREVERT CompilationConfig overhaul (#20283 ): deprecate use_inductor in favor of backend, simplify custom_ops (#26502 ) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>	2025-10-13 22:47:16 +00:00
Harry Mellor	8fcaaf6a16	Update `Optional[x]` -> `x \| None` and `Union[x, y]` to `x \| y` (#26633 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-12 09:51:31 -07:00
Angela Yi	a25f2adee9	[compile] Add patched_fused_scaled_matmul_reduce_scatter (#26604 ) Signed-off-by: angelayi <yiangela7@gmail.com>	2025-10-11 05:44:43 -07:00
Zhengxu Chen	eef921f45e	AOT Compilation for torch.compile (Bundled) (#24274 ) Signed-off-by: zhxchen17 <zhxchen17@fb.com>	2025-10-10 19:02:11 -04:00
baonudesifeizhai	cddce79fda	[torch.compile] Make inductor partition rules respect splitting_ops #25691 (#25845 ) Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com> Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-10 16:35:28 +00:00
Jason Li	f4ba2061cf	[BugFix][torch.compile] Fix fused_scaled_matmul_reduce_scatter signature for PyTorch 2.8 (#26038 ) Signed-off-by: jasonlizhengjian <jasonlizhengjian@gmail.com> Signed-off-by: <> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-10 07:42:13 -07:00

1 2 3 4 5

208 Commits