5 Commits

Author SHA1 Message Date
Lucas Wilkinson
d31a647124
[BugFix] Fix import error on non-blackwell machines (#21020)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-07-15 22:27:29 -07:00
Alexander Matveev
8cdc371217
SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
2025-07-15 01:06:38 +00:00
Tyler Michael Smith
e8c3bd2cd1
[Bugfix] Fix some narrowing conversion warnings (#20141)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-06-27 09:01:28 -07:00
Kaixi Hou
41aa578428
[NVIDIA] Add Cutlass MLA backend (#17625) 2025-06-03 21:40:26 -07:00
Kaixi Hou
ed7a29d9f8
[NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032)
Signed-off-by: kaixih <kaixih@nvidia.com>
2025-04-27 06:29:21 -07:00