vllm/quantization at 634a14bd7d099442e338938cc2dc456266eedaa4 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-22 07:37:10 +08:00

Roberto L. Castro 4fa7ce46f3

[Feature] Add SM103 (Blackwell Ultra) Support to vLLM (#30484)

Signed-off-by: LopezCastroRoberto <robertol.c510@gmail.com>
Signed-off-by: Roberto L. Castro <38211239+LopezCastroRoberto@users.noreply.github.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>

2025-12-12 19:34:23 -08:00

__init__.py

…

fp_quant.py

[Transform] [Quantization] Add QuTLASS support to vLLM (#24440)

2025-10-10 09:43:40 -07:00

reference_mxfp4.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_auto_round.py

[CI] Prune Quantization Tests and skip compilation (#27038)

2025-10-16 17:26:35 -04:00

test_blackwell_moe.py

[Feature] Add SM103 (Blackwell Ultra) Support to vLLM (#30484)

2025-12-12 19:34:23 -08:00

test_compressed_tensors.py

[Bugfix] Make compressed-tensors MoEs respect ignored layers (#28878)

2025-11-26 21:35:13 -05:00

test_configs.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_cpu_offload.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_cpu_wna16.py

[CPU] Refactor CPU WNA16 (#28826)

2025-11-19 10:32:00 +08:00

test_experts_int8.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_fp8.py

[Quantization] FP8 Weight Reloading for Quantized RL Rollout (#28480)

2025-12-09 13:54:32 -08:00

test_gptq_dynamic.py

[CI] Prune Quantization Tests and skip compilation (#27038)

2025-10-16 17:26:35 -04:00

test_gptq_v2.py

[Kernel] Add GPTQv2 format support for low-bit or asymmetric quantization, by adapting gptq_gemm (#26092)

2025-10-23 23:26:13 -04:00

test_ipex_quant.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_lm_head.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_mixed_precision.py

[ROCm][Quantization] extend AMD Quark to support mixed-precision quantized model (#24239)

2025-11-11 12:05:22 -05:00

test_modelopt.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_ptpc_fp8.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_quark.py

[Deprecation] Remove deprecated plugin and compilation fields for v0.13 release (#30396)

2025-12-10 19:59:35 -08:00

test_register_quantization_config.py

[CI Sprint] Quantization CI Cleanup (#24130)

2025-11-18 09:21:48 -05:00

test_rtn.py

[CI] Prune Quantization Tests and skip compilation (#27038)

2025-10-16 17:26:35 -04:00

test_torchao.py

[torchao] fix safetensors for sharding (#28169)

2025-11-19 16:39:45 -08:00

utils.py

…