vllm/tests/kernels/attention
Latest commit: 24d0c9e6ed by elvischenv
[NVIDIA][torch.compile] Support Flashinfer TRTLLM FP8-q/kv NVFP4-out Attention Kernel (#22703)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-08-22 22:09:05 +00:00
conftest.py
test_aiter_flash_attn.py
test_attention_selector.py
test_attention.py
test_cache.py
test_cascade_flash_attn.py
test_encoder_decoder_attn.py
test_flash_attn.py
test_flashinfer_trtllm_attention.py
test_flashinfer.py
test_flashmla.py
test_lightning_attn.py
test_merge_attn_states.py
test_mha_attn.py
test_mla_decode_cpu.py
test_prefix_prefill.py
test_rocm_attention_selector.py
test_triton_decode_attention.py
test_triton_unified_attention.py
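The modules above are standard pytest test files. A minimal sketch of invoking one of them programmatically, assuming it is run from the root of a vLLM source checkout with pytest and the relevant GPU attention backends installed (the chosen test path is taken from the listing; nothing here is vLLM-specific API):

# Minimal sketch: run one of the listed kernel test modules via pytest.
# Assumption: executed from the vLLM repository root with pytest installed
# and the hardware/backends required by the selected tests available.
import sys
import pytest

if __name__ == "__main__":
    # Collect and run only the FlashAttention kernel tests, in quiet mode.
    exit_code = pytest.main(["tests/kernels/attention/test_flash_attn.py", "-q"])
    sys.exit(exit_code)

Any other file from the listing can be substituted for the path argument, or the whole directory (tests/kernels/attention) can be passed to run the full attention kernel test suite.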