vllm/quantization at 8c0d15d5c5658b74a70694124af2ac250fdc4e23 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-08-03 03:57:02 +08:00

Lu Fang 8c0d15d5c5

[Misc][Easy] Annotate unused vars in the csrc files (#14798)

Signed-off-by: Lu Fang <lufang@fb.com>

2025-03-15 12:40:09 +08:00

[Kernel] fix types used in aqlm and ggml kernels to support dynamo (#7596)

2024-08-16 14:00:11 -07:00

[Kernel] Fix awq error when n is not divisable by 128 (#13227)

2025-02-13 20:07:05 -08:00

compressed_tensors

[MISC] Replace c10::optional with std::optional (#11730)

2025-01-05 10:20:34 +09:00

[Build/BugFix] Fix hopper 12.8 build (#14354)

2025-03-08 08:11:56 +00:00

[Kernel] Add ModelOpt FP4 Checkpoint Support (#12520)

2025-03-12 05:13:11 +00:00

[Misc][Easy] Annotate unused vars in the csrc files (#14798)

2025-03-15 12:40:09 +08:00

dynamic distpatch of fp8 kernels (#14245)

2025-03-11 10:54:56 -04:00

[Kernel] GGUF MoE kernel (#14613)

2025-03-12 03:33:27 +00:00

[Misc][Easy] Annotate unused vars in the csrc files (#14798)

2025-03-15 12:40:09 +08:00

[Bugfix][Kernel]: Fix AllSpark kernel compilation errors and enable for CUDA < 12.0 (#14430)

2025-03-14 09:55:14 -07:00

[Kernel] optimize performance of gptq marlin kernel when n is small (#14138)

2025-03-07 11:53:38 -05:00

add cutlass support for blackwell fp8 gemm (#13798)

2025-03-04 07:55:07 -08:00

Update pre-commit hooks (#12475)

2025-01-27 17:23:08 -07:00

vectorization.cuh

dynamic distpatch of fp8 kernels (#14245)

2025-03-11 10:54:56 -04:00