vllm/attention at 8bbcf8b6e7ad0cdeaef010bd834bd723f4e00445 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-24 22:47:25 +08:00

[v1] Add real sliding window calculation to FlexAttention direct BlockMask building (#26015)

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
Co-authored-by: baonudesifeizhai <baonudesifeizhai@gmail.com>

2025-12-01 13:12:51 +00:00

test_attention_backends_selection.py

…

test_attention_backends.py

[v1] Add real sliding window calculation to FlexAttention direct BlockMask building (#26015)

2025-12-01 13:12:51 +00:00

test_attention_splitting.py

…

test_batch_reordering.py

[BugFix] Reordering extend logic fix (#27739)

2025-10-29 21:39:34 -07:00

test_chunked_local_attention.py

…

test_mla_backends.py

[Attention] Refactor FA block_size limitations to hybrid models only (#29084)

2025-11-22 06:38:44 -08:00

test_rocm_attention_backends_selection.py

[Attention] Update attention imports (#29540)

2025-11-27 11:19:09 -05:00

test_sparse_mla_backends.py

Add TP parameter to attention tests (#27683)

2025-11-03 13:04:40 -08:00

utils.py

[ROCm][CI] Fix test_cudagraph_mode failure in AMD CI (#29367)

2025-11-25 07:55:09 +00:00