xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-01-11 04:04:27 +08:00
vllm/tests/compile
Latest commit: b30dfa03c5 by Matthew Bonanni, 2025-11-11 07:40:44 -05:00
[Attention] Refactor CUDA attention backend selection logic (#24794)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
piecewise/                     …
__init__.py                    …
backend.py                     …
silly_attention.py             …
test_aot_compile.py            …
test_async_tp.py               …
test_basic_correctness.py      …
test_config.py                 remove resolve_op_overloads and use splitting_ops directly (#28081)   2025-11-08 01:13:13 +00:00
test_decorator.py              …
test_full_graph.py             [Bugfix] Ensure calculated KV scales are applied in attention. (#27232)   2025-11-10 23:42:37 +00:00
test_functionalization.py      …
test_fusion_all_reduce.py      …
test_fusion_attn.py            [Attention] Refactor CUDA attention backend selection logic (#24794)   2025-11-11 07:40:44 -05:00
test_fusion.py                 …
test_fusions_e2e.py            [Attention] Refactor CUDA attention backend selection logic (#24794)   2025-11-11 07:40:44 -05:00
test_multimodal_compile.py     [Multimodal][torch.compile] Add compilation config field for turning off ViT/MM compile (#28242)   2025-11-07 00:16:03 +00:00
test_noop_elimination.py       …
test_pass_manager.py           …
test_sequence_parallelism.py   …
test_silu_mul_quant_fusion.py  …
test_wrapper.py                …
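
The files above appear to make up vLLM's torch.compile test suite. As a minimal sketch of how one of them might be run locally, the snippet below invokes pytest on test_config.py from the repository root; it assumes a working development install of vLLM in the current environment and, for many of these compile tests, a CUDA-capable GPU.

```python
# Minimal sketch: run one file from tests/compile with pytest.
# Assumptions: executed from the repository root, vLLM installed in the
# current environment (e.g. `pip install -e .`), pytest available, and a
# CUDA-capable GPU for the tests that exercise GPU kernels.
import sys

import pytest

if __name__ == "__main__":
    # pytest.main() accepts the usual pytest CLI arguments and returns an
    # exit code; -v prints one line per test.
    sys.exit(pytest.main(["-v", "tests/compile/test_config.py"]))
```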