Junhao Li
|
3303f134e0
|
[Kernel] Add support for block FP8 on SM120 (NVIDIA 5090 and RTX PRO 6000) (#22131)
Signed-off-by: Junhao Li <junhao@ubicloud.com>
|
2025-08-07 19:18:28 -07:00 |
|
lyrisz
|
c6c9122d50
|
[Kernel] SM90 CUTLASS FP8 GEMM: add support for swap AB + kernel tuning (#20396)
Signed-off-by: Faqin Zhong <faqin.zhong@gmail.com>
Co-authored-by: Duncan Moss <djm.moss@gmail.com>
|
2025-07-28 23:13:58 +00:00 |
|
Duncan Moss
|
3d184b95b8
|
[feat]: CUTLASS block scaled group gemm for SM100 (#19757)
Signed-off-by: Duncan Moss <djm.moss@gmail.com>
Co-authored-by: Duncan Moss <dmoss@nvidia.com>
|
2025-07-04 12:58:04 -06:00 |
|
Joonchen Liau
|
9e5552aa13
|
[NVIDIA] Support Cutlass w8a8 FP8 for Blackwell Geforce GPUs (sm120) (#17280)
Signed-off-by: kaln27 <liaojuncheng123@foxmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-07-02 06:47:19 -06:00 |
|
Ilya Markov
|
2d7779f888
|
[Perf] SM100 FP8 GEMM Optimizations after cutlass_profiler (#20071)
Signed-off-by: ilmarkov <imarkov@redhat.com>
Co-authored-by: ilmarkov <imarkov@redhat.com>
|
2025-06-26 20:50:09 -07:00 |
|
Ilya Markov
|
e13945f9dd
|
[Perf] Further tunings for SM100 FP8 CUTLASS kernel (#19566)
|
2025-06-14 17:25:10 -07:00 |
|
Michael Goin
|
53a5a0ce30
|
[Perf] Tunings for SM100 FP8 CUTLASS kernel (#18778)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-06-04 10:46:28 -07:00 |
|
Lain
|
5f2cd251d2
|
Sm100 blockwise fp8 swap ab (#18564)
|
2025-06-04 07:48:45 -07:00 |
|
Lain
|
e23564cb70
|
use ceil_div in cutlass block scaling shape check (#17918)
|
2025-05-16 03:02:58 -07:00 |
|
Shu Wang
|
376786fac1
|
Add cutlass support for blackwell fp8 blockwise gemm (#14383)
Signed-off-by: Shu Wang <shuw@nvidia.com>
|
2025-05-08 15:09:55 -07:00 |
|
kushanam
|
f89978ad7c
|
add cutlass support for blackwell fp8 gemm (#13798)
|
2025-03-04 07:55:07 -08:00 |
|
leoneo
|
839b27c6cc
|
[Kernel]Add streamK for block-quantized CUTLASS kernels (#12978)
|
2025-02-20 22:14:24 -08:00 |
|
Tyler Michael Smith
|
c1e37bf71b
|
[Kernel][Bugfix] Refactor and Fix CUTLASS 2:4 Sparse Kernels (#13198)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2025-02-14 00:01:14 +00:00 |
|
Lucas Wilkinson
|
9798b2fb00
|
[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)
|
2025-01-30 18:33:00 -08:00 |
|