vllm/compile at 5eeef1b90852917b300ed67b98e341eb846ba2e9 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-23 09:07:15 +08:00

History

[Kernel] Added flashinfer fp8 per-tensor gemms (#22895 )

Signed-off-by: Julien Lin <jullin@nvidia.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>

2025-08-26 06:54:04 -07:00

piecewise

[torch.compile] Support conditional torch.compile per module (#22269 )

2025-08-20 16:52:59 +00:00

__init__.py

[torch.compile] register allreduce operations as custom ops (#8526 )

2024-09-16 22:57:57 -07:00

backend.py

[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756 )

2025-06-12 08:31:04 -07:00

test_async_tp.py

[Feature] Add async tensor parallelism for scaled mm (#20155 )

2025-07-30 17:23:41 -04:00

test_basic_correctness.py

[V0 Deprecation] Remove V0 FlashInfer attention backend (#22776 )

2025-08-18 19:54:16 -07:00

test_config.py

[Bugfix] VLLM_V1 supports passing other compilation levels (#19340 )

2025-07-29 09:35:58 -04:00

test_decorator.py

[torch.compile] Support conditional torch.compile per module (#22269 )

2025-08-20 16:52:59 +00:00

test_full_graph.py

[Kernel/Quant] Remove the original marlin format and qqq (#23204 )

2025-08-20 15:13:36 -04:00

test_functionalization.py

[NVIDIA][torch.compile] Support Flashinfer TRTLLM FP8-q/kv NVFP4-out Attention Kernel (#22703 )

2025-08-22 22:09:05 +00:00

test_fusion_all_reduce.py

[CI Perf] Only test bfloat16 for tests/compile/test_fusion_all_reduce.py (#23132 )

2025-08-19 20:18:52 -06:00

test_fusion_attn.py

[NVIDIA][torch.compile] Support Flashinfer TRTLLM FP8-q/kv NVFP4-out Attention Kernel (#22703 )

2025-08-22 22:09:05 +00:00

test_fusion.py

[Kernel] Added flashinfer fp8 per-tensor gemms (#22895 )

2025-08-26 06:54:04 -07:00

test_pass_manager.py

[Misc] Add SPDX-FileCopyrightText (#19100 )

2025-06-03 11:20:17 -07:00

test_sequence_parallelism.py

[Kernel] Added flashinfer fp8 per-tensor gemms (#22895 )

2025-08-26 06:54:04 -07:00

test_silu_mul_quant_fusion.py

[Kernel] Added flashinfer fp8 per-tensor gemms (#22895 )

2025-08-26 06:54:04 -07:00

test_wrapper.py

[Misc] Add SPDX-FileCopyrightText (#19100 )

2025-06-03 11:20:17 -07:00