vllm/rocm at db9dfcfa6a0b88fb880ee21b56f133c9c5a600ab - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-10 19:27:20 +08:00

Lu Fang 051da7efe3

Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160)

Signed-off-by: Lu Fang <lufang@fb.com>
Co-authored-by: Richard Barnes <rbarnes@meta.com>

2025-03-25 15:36:45 +08:00

attention.cu

Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160)

2025-03-25 15:36:45 +08:00

ops.h

[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906)

2025-01-23 18:04:03 +00:00

torch_bindings.cpp

[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906)

2025-01-23 18:04:03 +00:00