vllm/tests/kernels/attention
Latest commit: 53c730286c "[Misc] parametrize 'dtype' in test_flash_mla" (#22641) by RUTHLESS-BOT, 2025-08-12 16:31:48 -04:00
Signed-off-by: RUTHLESS-BOT <wujiafeng@cmbchina.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
| File | Last commit | Date |
|---|---|---|
| conftest.py | … | … |
| test_aiter_flash_attn.py | [ROCm][AITER] Enable fp8 kv cache on rocm aiter backend. (#20295) | 2025-07-25 06:50:21 -07:00 |
| test_attention_selector.py | [UX] Fail if an invalid attention backend is specified (#22217) | 2025-08-04 23:54:52 -07:00 |
| test_attention.py | … | … |
| test_cache.py | … | … |
| test_cascade_flash_attn.py | … | … |
| test_encoder_decoder_attn.py | … | … |
| test_flash_attn.py | … | … |
| test_flashinfer_trtllm_attention.py | [NVIDIA] Support Flashinfer TRT-LLM Prefill Attention Kernel (#22095) | 2025-08-05 02:45:34 -07:00 |
| test_flashinfer.py | [Misc] Add sliding window to flashinfer test (#21282) | 2025-07-21 08:37:49 -07:00 |
| test_flashmla.py | [Misc] parametrize 'dtype' in test_flash_mla (#22641) | 2025-08-12 16:31:48 -04:00 |
| test_lightning_attn.py | … | … |
| test_merge_attn_states.py | … | … |
| test_mha_attn.py | … | … |
| test_mla_decode_cpu.py | … | … |
| test_prefix_prefill.py | … | … |
| test_rocm_attention_selector.py | … | … |
| test_triton_decode_attention.py | … | … |
| test_triton_unified_attention.py | … | … |