vllm/c3x at 094b7d9496ccbdcb15bafbdab0083e54734da2d6 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-25 18:57:15 +08:00

leoneo 839b27c6cc

[Kernel]Add streamK for block-quantized CUTLASS kernels (#12978)

2025-02-20 22:14:24 -08:00

cutlass_gemm_caller.cuh

[Kernel]Add streamK for block-quantized CUTLASS kernels (#12978)

2025-02-20 22:14:24 -08:00

scaled_mm_azp_sm90_int8.cu

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm_blockwise_sm90_fp8_dispatch.cuh

[Kernel]Add streamK for block-quantized CUTLASS kernels (#12978)

2025-02-20 22:14:24 -08:00

scaled_mm_blockwise_sm90_fp8.cu

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm_kernels.hpp

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm_sm90_fp8_dispatch.cuh

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm_sm90_fp8.cu

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm_sm90_int8_dispatch.cuh

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm_sm90_int8.cu

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

scaled_mm.cuh

[Kernel][Bugfix] Refactor and Fix CUTLASS 2:4 Sparse Kernels (#13198)

2025-02-14 00:01:14 +00:00