vllm/quantization at 90189c71a9629cf2c866b213d6d28b08937a7566 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-12 04:57:10 +08:00

Support using Int4PreshuffledTensor after loading (#26066)

Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>

2025-11-04 06:00:57 -05:00

| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00 |
| `fp_quant.py` | [Transform] [Quantization] Add QuTLASS support to vLLM (#24440) | 2025-10-10 09:43:40 -07:00 |
| `reference_mxfp4.py` | Convert formatting to use ruff instead of yapf + isort (#26247) | 2025-10-05 07:06:22 -07:00 |
| `test_auto_round.py` | [CI] Prune Quantization Tests and skip compilation (#27038) | 2025-10-16 17:26:35 -04:00 |
| `test_blackwell_moe.py` | [BugFix][Performance] Restore flashinfer autotuning for all scenarios (#27904) | 2025-11-04 15:56:21 +08:00 |
| `test_compressed_tensors.py` | [1/N][Platform] Cleanup useless function (#26982) | 2025-10-22 09:04:57 +00:00 |
| `test_configs.py` | Convert formatting to use ruff instead of yapf + isort (#26247) | 2025-10-05 07:06:22 -07:00 |
| `test_cpu_offload.py` | [CI] Prune Quantization Tests and skip compilation (#27038) | 2025-10-16 17:26:35 -04:00 |
| `test_experts_int8.py` | Convert formatting to use ruff instead of yapf + isort (#26247) | 2025-10-05 07:06:22 -07:00 |
| `test_fp8.py` | [CI Failure] Fix test_kv_cache_model_load_and_run (#27717) | 2025-10-30 12:27:53 +00:00 |
| `test_gptq_dynamic.py` | [CI] Prune Quantization Tests and skip compilation (#27038) | 2025-10-16 17:26:35 -04:00 |
| `test_gptq_v2.py` | [Kernel] Add GPTQv2 format support for low-bit or asymmetric quantization, by adapting gptq_gemm (#26092) | 2025-10-23 23:26:13 -04:00 |
| `test_ipex_quant.py` | Convert formatting to use ruff instead of yapf + isort (#26247) | 2025-10-05 07:06:22 -07:00 |
| `test_lm_head.py` | [CI] Prune Quantization Tests and skip compilation (#27038) | 2025-10-16 17:26:35 -04:00 |
| `test_modelopt.py` | Convert formatting to use ruff instead of yapf + isort (#26247) | 2025-10-05 07:06:22 -07:00 |
| `test_ptpc_fp8.py` | Convert formatting to use ruff instead of yapf + isort (#26247) | 2025-10-05 07:06:22 -07:00 |
| `test_quark.py` | [CI] Prune Quantization Tests and skip compilation (#27038) | 2025-10-16 17:26:35 -04:00 |
| `test_register_quantization_config.py` | Update Optional[x] -> x \| None and Union[x, y] to x \| y (#26633) | 2025-10-12 09:51:31 -07:00 |
| `test_rtn.py` | [CI] Prune Quantization Tests and skip compilation (#27038) | 2025-10-16 17:26:35 -04:00 |
| `test_torchao.py` | Support using Int4PreshuffledTensor after loading (#26066) | 2025-11-04 06:00:57 -05:00 |
| `utils.py` | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00 |