xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-04-08 16:57:04 +08:00)
vllm/vllm/attention
Latest commit: cd9b9de1fb by Lucas Wilkinson, 2025-08-08 16:09:42 -07:00
[BugFix] Fix IMA FlashMLA full cuda-graph and DP + Update FlashMLA (#21691)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Name           Last updated                Last commit
backends/      2025-08-06 18:40:52 -07:00  [Attention] Support multiple attention metadata builders per kv_cache_spec + proper local attention no hybrid kv cache fix (#21588)
layers/        2025-08-06 18:40:52 -07:00  [Attention] Support multiple attention metadata builders per kv_cache_spec + proper local attention no hybrid kv cache fix (#21588)
ops/           2025-08-08 16:09:42 -07:00  [BugFix] Fix IMA FlashMLA full cuda-graph and DP + Update FlashMLA (#21691)
utils/         2025-07-15 12:16:33 +00:00  [MISC] Add init files for python package (#20908)
__init__.py    2025-06-03 11:20:17 -07:00  [Misc] Add SPDX-FileCopyrightText (#19100)
layer.py       2025-08-06 18:40:52 -07:00  [Attention] Support multiple attention metadata builders per kv_cache_spec + proper local attention no hybrid kv cache fix (#21588)
selector.py    2025-08-06 18:40:52 -07:00  [Attention] Support multiple attention metadata builders per kv_cache_spec + proper local attention no hybrid kv cache fix (#21588)
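
The layout above suggests how attention is wired up in vLLM: layer.py provides the Attention layer used by model code, selector.py picks a concrete backend, backends/ holds the per-backend implementations, and ops/ contains the lower-level kernels they call into. As a minimal illustrative sketch (not taken from this listing; the backend name and model below are assumptions), the backend choice can typically be forced through the VLLM_ATTENTION_BACKEND environment variable before an engine is created:

# Hedged sketch: assumes vLLM is installed and that the
# VLLM_ATTENTION_BACKEND environment variable steers which
# implementation under vllm/attention/backends/ gets used.
# "FLASH_ATTN" and the model name are illustrative assumptions.
import os
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")        # small model for a smoke test
params = SamplingParams(max_tokens=16)
outputs = llm.generate(["The attention module in vLLM"], params)
print(outputs[0].outputs[0].text)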