vllm/cutlass_w8a8 at 8a87cd27d94f03068b9cbc85058636fc16222e24 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-12 19:07:11 +08:00

Junhao Li 3303f134e0

[Kernel] Add support for block FP8 on SM120 (NVIDIA 5090 and RTX PRO 6000) (#22131)

Signed-off-by: Junhao Li <junhao@ubicloud.com>

2025-08-07 19:18:28 -07:00

[Kernel] Add support for block FP8 on SM120 (NVIDIA 5090 and RTX PRO 6000) (#22131)

2025-08-07 19:18:28 -07:00

[Bug] Fix Compressed Tensor NVFP4 cutlass_fp4_group_mm illegal memory access (#21465)

2025-07-24 08:13:24 -07:00

Epilogues.md

[Docs] Switch to better markdown linting pre-commit hook (#21851)

2025-07-29 19:45:08 -07:00

scaled_mm_c2x_sm75_dispatch.cuh

[Kernel] Tuned int8 Cutlass Kernels for SM75 (T4) (#6996)

2024-07-31 14:40:32 -07:00

scaled_mm_c2x_sm80_dispatch.cuh

[Kernel] Tuned FP8 Kernels for Ada Lovelace (#6677)

2024-07-29 09:42:35 -06:00

scaled_mm_c2x_sm89_fp8_dispatch.cuh

[Bugfix] Fix cutlass dispatch for fp8/int8 to properly invoke M<=16 c… (#16751)

2025-04-27 19:38:42 -07:00

scaled_mm_c2x_sm89_int8_dispatch.cuh

[Bugfix] Fix cutlass dispatch for fp8/int8 to properly invoke M<=16 c… (#16751)

2025-04-27 19:38:42 -07:00

scaled_mm_c2x.cu

[MISC] Replace c10::optional with std::optional (#11730)

2025-01-05 10:20:34 +09:00

scaled_mm_c2x.cuh

[Kernel][Bugfix] Refactor and Fix CUTLASS 2:4 Sparse Kernels (#13198)

2025-02-14 00:01:14 +00:00

scaled_mm_c3x_sm90.cu

Add cutlass support for blackwell fp8 blockwise gemm (#14383)

2025-05-08 15:09:55 -07:00

scaled_mm_c3x_sm100.cu

Add cutlass support for blackwell fp8 blockwise gemm (#14383)

2025-05-08 15:09:55 -07:00

scaled_mm_c3x_sm120.cu

[Kernel] Add support for block FP8 on SM120 (NVIDIA 5090 and RTX PRO 6000) (#22131)

2025-08-07 19:18:28 -07:00

scaled_mm_entry.cu

[feat]: add SM100 support for cutlass FP8 groupGEMM (#20447)

2025-07-22 07:27:12 -07:00