vllm/fp4 at 721fb9b1818ef23c15fd176c7ea49285de544021 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-13 05:57:11 +08:00

Pavani Majety 0c0fdae84f

[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362)

2025-05-09 16:24:41 -07:00

nvfp4_blockwise_moe_kernel.cu

[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362)

2025-05-09 16:24:41 -07:00

nvfp4_experts_quant.cu

[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362)

2025-05-09 16:24:41 -07:00

nvfp4_quant_entry.cu

[Hardware/NVIDIA/Kernel] Enable nvidia/DeepSeek-R1-FP4 Model (#16362)

2025-05-09 16:24:41 -07:00

nvfp4_quant_kernels.cu

[NVIDIA] Fix an issue to use current stream for the nvfp4 quant (#13632)

2025-02-20 22:01:48 -08:00

nvfp4_scaled_mm_entry.cu

[Kernel] Add ModelOpt FP4 Checkpoint Support (#12520)

2025-03-12 05:13:11 +00:00

nvfp4_scaled_mm_kernels.cu

[NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032)

2025-04-27 06:29:21 -07:00