vllm/tests/kernels/attention
Latest commit: 7f829be7d3 [CPU] Refactor CPU attention backend (#27954) by Li, Jiang (Signed-off-by: jiang1.li <jiang1.li@intel.com>), 2025-11-12 09:43:06 +08:00
conftest.py: …
test_aiter_flash_attn.py: …
test_attention_selector.py: [CPU] Refactor CPU attention backend (#27954), 2025-11-12 09:43:06 +08:00
test_attention.py: …
test_cache.py: …
test_cascade_flash_attn.py: …
test_cpu_attn.py: [CPU] Refactor CPU attention backend (#27954), 2025-11-12 09:43:06 +08:00
test_cutlass_mla_decode.py: …
test_deepgemm_attention.py: …
test_flash_attn.py: [Test] Remove old non-varlen FA2 test (#28420), 2025-11-10 23:57:41 +00:00
test_flashinfer_mla_decode.py: …
test_flashinfer_trtllm_attention.py: Update Flashinfer from v0.4.1 to v0.5.2 (#27952), 2025-11-07 16:24:42 -08:00
test_flashinfer.py: …
test_flashmla_sparse.py: …
test_flashmla.py: …
test_lightning_attn.py: …
test_merge_attn_states.py: …
test_mha_attn.py: [Attention] Refactor CUDA attention backend selection logic (#24794), 2025-11-11 07:40:44 -05:00
test_mla_decode_cpu.py: …
test_pack_unpack_triton.py: …
test_prefix_prefill.py: [CI/Build] Refactor Attention backend for test_prefix_prefill from xformers to SDPA (#28424), 2025-11-12 01:09:47 +08:00
test_rocm_attention_selector.py: …
test_triton_decode_attention.py: …
test_triton_unified_attention.py: …
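
These files follow the standard pytest layout (note the shared conftest.py). A minimal sketch of invoking one of them programmatically, assuming a vLLM source checkout with pytest installed and the relevant backend hardware available; the helper script name and the test selection are illustrative, not part of the repository:

    # run_attention_tests.py (hypothetical helper, not part of the repo).
    # Assumes it is run from the vLLM source root with pytest installed.
    import pytest

    if __name__ == "__main__":
        # Run the CPU attention backend tests touched by #27954, verbosely;
        # pytest.main returns an exit code suitable for SystemExit.
        raise SystemExit(
            pytest.main(["tests/kernels/attention/test_cpu_attn.py", "-v"])
        )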