vllm/attention at a0d74ebf7f357b6ae281f921e049b70ef324af89 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-18 22:37:00 +08:00


Matthew Bonanni 4c23690f43

[Attention] FlashAttention ViT support, make default backend (#28763)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>

2025-11-18 20:06:21 -08:00

conftest.py

[Chore] Clean up pytorch helper functions in vllm.utils (#26908)

2025-10-18 09:48:22 -07:00

test_aiter_flash_attn.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_attention_selector.py

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

test_attention.py

[Chore] Separate out vllm.utils.mem_utils (#27143)

2025-10-18 10:06:59 +00:00

test_cache.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_cascade_flash_attn.py

[BugFix] Fix FA3 IMA with FULL_AND_PIECEWISE and cascade attention (default) (#28702)

2025-11-14 12:19:22 +00:00

test_cpu_attn.py

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

test_cutlass_mla_decode.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_deepgemm_attention.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_flash_attn.py

[Attention] FlashAttention ViT support, make default backend (#28763)

2025-11-18 20:06:21 -08:00

test_flashinfer_mla_decode.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_flashinfer_trtllm_attention.py

Update Flashinfer from v0.4.1 to v0.5.2 (#27952)

2025-11-07 16:24:42 -08:00

test_flashinfer.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_flashmla_sparse.py

[Misc] Clean up cruft from previous FlashMLA sparse implementation (#26125)

2025-10-08 10:09:34 +08:00

test_flashmla.py

[Misc] Clean up cruft from previous FlashMLA sparse implementation (#26125)

2025-10-08 10:09:34 +08:00

test_lightning_attn.py

Fix per file ruff ignores related to simplification (#26259)

2025-10-05 20:31:53 +00:00

test_merge_attn_states.py

Replace torch.cuda.Event with torch.Event for better hardware compatibility (#26985)

2025-11-18 11:34:36 -08:00

test_mha_attn.py

[Attention] FlashAttention ViT support, make default backend (#28763)

2025-11-18 20:06:21 -08:00

test_mla_decode_cpu.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_pack_unpack_triton.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_prefix_prefill.py

[CI/Build] Refactor Attention backend for test_prefix_prefill from xformers to SDPA (#28424)

2025-11-12 01:09:47 +08:00

test_rocm_attention_selector.py

[Attention] Implement universal BACKEND_MAP (#25900)

2025-10-08 12:00:25 -07:00

test_triton_decode_attention.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_triton_unified_attention.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00