vllm/quantization at 6d1479ca4b5a3904b6c5b4a1d741dda43efdc289 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-13 11:06:48 +08:00

[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations (#10867)

Signed-off-by: Sage Moore <sage@neuralmagic.com>

2025-05-01 07:59:28 -07:00

aqlm

[Kernel] fix types used in aqlm and ggml kernels to support dynamo (#7596)

2024-08-16 14:00:11 -07:00

awq

[Kernel] Fix awq error when n is not divisable by 128 (#13227)

2025-02-13 20:07:05 -08:00

compressed_tensors

[MISC] Replace c10::optional with std::optional (#11730)

2025-01-05 10:20:34 +09:00

cutlass_w8a8

[Bugfix] Fix cutlass dispatch for fp8/int8 to properly invoke M<=16 c… (#16751)

2025-04-27 19:38:42 -07:00

fp4

[NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032)

2025-04-27 06:29:21 -07:00

fp8

[Feature][ROCm] Enable fusion pass for torch.compile on ROCm (#15050)

2025-03-31 04:42:18 -07:00

fused_kernels

[Bugfix] Fix numel() downcast in fused_layernorm_dynamic_per_token_quant.cu (#17316)

2025-04-28 19:23:18 -07:00

gguf

[BugFix][ROCm] Fix GGUF MoE Dispatch Block_Dim for ROCm (#16247)

2025-04-08 05:10:26 -07:00

gptq

Fix CUDA kernel index data type in vllm/csrc/quantization/fused_kernels/layernorm_utils.cuh +10 (#15159)

2025-03-21 10:01:11 +08:00

gptq_allspark

pre-commit autoupdate (#17380)

2025-04-29 06:46:55 -07:00

gptq_marlin

pre-commit autoupdate (#17380)

2025-04-29 06:46:55 -07:00

machete

add cutlass support for blackwell fp8 gemm (#13798)

2025-03-04 07:55:07 -08:00

marlin

pre-commit autoupdate (#17380)

2025-04-29 06:46:55 -07:00

activation_kernels.cu

[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations (#10867)

2025-05-01 07:59:28 -07:00

utils.cuh

[Feature][ROCm] Enable fusion pass for torch.compile on ROCm (#15050)

2025-03-31 04:42:18 -07:00

vectorization.cuh

dynamic distpatch of fp8 kernels (#14245)

2025-03-11 10:54:56 -04:00