Michael Goin | f9a4087182 | Remove weight_scale.T special case for SM90 Block FP8 CUTLASS kernel (#28431); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-11-11 11:46:04 -05:00
Michael Goin | c3aea10dc8 | [Perf] Use upstream CUTLASS for SM90 Block FP8 kernel (#23280); Signed-off-by: mgoin <mgoin64@gmail.com>; Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> | 2025-09-11 15:43:14 -07:00
Michael Goin | b7adf94c4a | Tuned H100/H200 triton fp8 block configs for fused_qkv_a_proj (#23939); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-08-29 10:28:35 -07:00
Michael Goin | a781e84ec2 | [Perf] Tune configs for triton block fp8 gemm H100/H200 (#23748); Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-08-28 11:12:53 +08:00