vllm/cutlass_extensions at 8805ad9fa9c04b2ce4e2a9adc217471798b1ae64 - vllm - 丝路新云 code repository

xinyun/vllm

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-07-20 03:07:09 +08:00

Junhao Li 3303f134e0

[Kernel] Add support for block FP8 on SM120 (NVIDIA 5090 and RTX PRO 6000) (#22131)

Signed-off-by: Junhao Li <junhao@ubicloud.com>

2025-08-07 19:18:28 -07:00

Replace multiply_add with homogeneous_multiply_add to Address Clang Template Parameter Issue (#20142)

2025-07-09 00:30:18 +00:00

[feat]: CUTLASS block scaled group gemm for SM100 (#19757)

2025-07-04 12:58:04 -06:00

common.cpp

[Kernel]: Cutlass 2:4 Sparsity + FP8/Int8 Quant Support (#10995)

2024-12-18 09:57:16 -05:00

common.hpp

[Kernel] Add support for block FP8 on SM120 (NVIDIA 5090 and RTX PRO 6000) (#22131)

2025-08-07 19:18:28 -07:00

cute_utils.cuh

[Kernel] Initial Machete W4A8 support + Refactors (#9855)

2024-11-18 12:59:29 -07:00

torch_utils.hpp

[MISC] Replace c10::optional with std::optional (#11730)

2025-01-05 10:20:34 +09:00

vllm_collective_builder.cuh

[Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868)

2025-01-30 18:33:00 -08:00

vllm_custom_types.cuh

[Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174)

2024-08-20 07:09:33 -06:00

vllm_cutlass_library_extension.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

vllm_numeric_conversion.cuh

[Kernel] Initial Machete W4A8 support + Refactors (#9855)

2024-11-18 12:59:29 -07:00

vllm_type_utils.cuh

[Kernel] Initial Machete W4A8 support + Refactors (#9855)

2024-11-18 12:59:29 -07:00