vllm/attention at 76e4dcf225e4de115bdc20b00a78d49bec767c09 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-02 08:31:21 +08:00

[CI/Build] Refactor Attention backend for test_prefix_prefill from xformers to SDPA (#28424)

Signed-off-by: zhewenli <zhewenli@meta.com>
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Roger Wang <hey@rogerw.io>

2025-11-12 01:09:47 +08:00

conftest.py

[Chore] Clean up pytorch helper functions in vllm.utils (#26908)

2025-10-18 09:48:22 -07:00

test_aiter_flash_attn.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_attention_selector.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_attention.py

[Chore] Separate out vllm.utils.mem_utils (#27143)

2025-10-18 10:06:59 +00:00

test_cache.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_cascade_flash_attn.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_cutlass_mla_decode.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_deepgemm_attention.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_flash_attn.py

[Test] Remove old non-varlen FA2 test (#28420)

2025-11-10 23:57:41 +00:00

test_flashinfer_mla_decode.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_flashinfer_trtllm_attention.py

Update Flashinfer from v0.4.1 to v0.5.2 (#27952)

2025-11-07 16:24:42 -08:00

test_flashinfer.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_flashmla_sparse.py

[Misc] Clean up cruft from previous FlashMLA sparse implementation (#26125)

2025-10-08 10:09:34 +08:00

test_flashmla.py

[Misc] Clean up cruft from previous FlashMLA sparse implementation (#26125)

2025-10-08 10:09:34 +08:00

test_lightning_attn.py

Fix per file ruff ignores related to simplification (#26259)

2025-10-05 20:31:53 +00:00

test_merge_attn_states.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_mha_attn.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_mla_decode_cpu.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_pack_unpack_triton.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_prefix_prefill.py

[CI/Build] Refactor Attention backend for test_prefix_prefill from xformers to SDPA (#28424)

2025-11-12 01:09:47 +08:00

test_rocm_attention_selector.py

[Attention] Implement universal BACKEND_MAP (#25900)

2025-10-08 12:00:25 -07:00

test_triton_decode_attention.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_triton_unified_attention.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00