Lu Fang
|
051da7efe3
|
Fix CUDA kernel index data type in vllm/csrc/quantization/gptq_marlin/awq_marlin_repack.cu +10 (#15160)
Signed-off-by: Lu Fang <lufang@fb.com>
Co-authored-by: Richard Barnes <rbarnes@meta.com>
|
2025-03-25 15:36:45 +08:00 |
|
Lucas Wilkinson
|
aeb37c2a72
|
[CI/Build] Per file CUDA Archs (improve wheel size and dev build times) (#8845)
|
2024-10-03 22:55:25 -04:00 |
|
bnellnm
|
73202dbe77
|
[Kernel][Misc] register ops to prevent graph breaks (#6917)
Co-authored-by: Sage Moore <sage@neuralmagic.com>
|
2024-09-11 12:52:19 -07:00 |
|
Alexander Matveev
|
396d92d5e0
|
[Kernel][Core] Add AWQ support to the Marlin kernel (#6612)
|
2024-07-21 19:41:42 -04:00 |
|