vllm/attention at 086b96339ffe057f92cd0a20c8be820b17e24dbf - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-03-21 08:48:02 +08:00

[MM Encoder]: Migrate legacy ViT MultiHeadAttention to new MMEncoderAttention interface (#30684)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>

2025-12-19 02:04:19 +08:00

conftest.py

…

test_aiter_flash_attn.py

[CI/Build][AMD] Skip if flash_attn_varlen_func not available in test_aiter_flash_attn.py (#29043)

2025-11-20 20:39:49 +00:00

test_attention_selector.py

[Attention] Update tests to remove deprecated env vars (#30563)

2025-12-17 09:49:59 -08:00

test_attention.py

[MM Encoder]: Migrate legacy ViT MultiHeadAttention to new MMEncoderAttention interface (#30684)

2025-12-19 02:04:19 +08:00

test_cache.py

[Perf][Deepseek] optimize gather_and_maybe_dequant_cache kernel's perf for extremely long sequence (#28029)

2025-11-24 19:05:46 -07:00

test_cascade_flash_attn.py

[CI/Build][AMD] Fix import errors in tests/kernels/attention (#29032)

2025-11-20 17:48:09 +08:00

test_cpu_attn.py

[cpu][ci] Add CPU Attention Tests for Neon Backend (#30347)

2025-12-10 05:37:35 +00:00

test_cutlass_mla_decode.py

[Feature] Add SM103 (Blackwell Ultra) Support to vLLM (#30484)

2025-12-12 19:34:23 -08:00

test_deepgemm_attention.py

…

test_flash_attn.py

[CI/Build][AMD] Fix import errors in tests/kernels/attention (#29032)

2025-11-20 17:48:09 +08:00

test_flashinfer_mla_decode.py

[CI/Build][AMD] Fix import errors in tests/kernels/attention (#29032)

2025-11-20 17:48:09 +08:00

test_flashinfer_trtllm_attention.py

[Kernels][FI] Skip trtllm attention when num_kv_heads=1 (#30842)

2025-12-17 01:54:21 -08:00

test_flashinfer.py

[CI/Build][AMD] Fix import errors in tests/kernels/attention (#29032)

2025-11-20 17:48:09 +08:00

test_flashmla_sparse.py

…

test_flashmla.py

…

test_lightning_attn.py

…

test_merge_attn_states.py

Replace torch.cuda.Event with torch.Event for better hardware compatibility (#26985)

2025-11-18 11:34:36 -08:00

test_mha_attn.py

[MM Encoder]: Migrate legacy ViT MultiHeadAttention to new MMEncoderAttention interface (#30684)

2025-12-19 02:04:19 +08:00

test_mla_decode_cpu.py

…

test_pack_unpack_triton.py

…

test_prefix_prefill.py

[CI/Build] Fix test_prefix_prefill for AMD (#28905)

2025-11-19 16:04:36 -05:00

test_rocm_attention_selector.py

[Attention] Update tests to remove deprecated env vars (#30563)

2025-12-17 09:49:59 -08:00

test_triton_decode_attention.py

…

test_triton_unified_attention.py

[Kernel] Support CUDA Graphs in 3D Triton Attention Kernel (#28306)

2025-12-12 16:55:40 +01:00