vllm/kernel at 0b51c9bd8b19cee3a494b0f966a6b0a846a40193 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-14 16:35:40 +08:00

Lucas Wilkinson 1726e93ef1

[BugFix][DP/EP] Fix CUTLASS MLA hang under load (#26026)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>

2025-10-01 12:30:00 -07:00

sm100_fmha_mla_reduction.hpp

SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)

2025-07-15 01:06:38 +00:00

sm100_fmha_mla_tma_warpspecialized.hpp

[BugFix][DP/EP] Fix CUTLASS MLA hang under load (#26026)

2025-10-01 12:30:00 -07:00

sm100_mla_tile_scheduler.hpp

SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)

2025-07-15 01:06:38 +00:00