vllm/attention at releases/v0.11.1 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-10 01:05:01 +08:00

Matthew Bonanni b30dfa03c5

[Attention] Refactor CUDA attention backend selection logic (#24794)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

2025-11-11 07:40:44 -05:00

test_attention_backends_selection.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_attention_backends.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_attention_splitting.py

[Core] Simplify the Dp padding/should ubatch coordination logic (#25768)

2025-10-07 01:57:49 +00:00

test_batch_reordering.py

[BugFix] Reordering extend logic fix (#27739)

2025-10-29 21:39:34 -07:00

test_chunked_local_attention.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_mla_backends.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00

test_sparse_mla_backends.py

Add TP parameter to attention tests (#27683)

2025-11-03 13:04:40 -08:00

utils.py

[Attention] Refactor CUDA attention backend selection logic (#24794)

2025-11-11 07:40:44 -05:00