vllm/quantization at 6d8246aaffff3ebec84767e373212a7b8da328e2 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-09 18:45:54 +08:00

haoyangli-amd ca2d1925ef

[Rocm] [quantization] Fix quark ptpc moe and add test case (#24649)

Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
Co-authored-by: Haoyang Li <haoyang.li@amd.com>

2025-09-16 22:15:13 -07:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

reference_mxfp4.py

[Feature][Quantization] MXFP4 support for MOE models (#17888)

2025-07-09 13:19:02 -07:00

test_auto_round.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_compressed_tensors.py

[Transform] [Quantization] Add transforms to compressed tensors (#22486)

2025-08-28 02:43:48 -04:00

test_configs.py

[Kernel/Quant] Remove the original marlin format and qqq (#23204)

2025-08-20 15:13:36 -04:00

test_cpu_offload.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_experts_int8.py

Update transformers to v4.55 (#21931)

2025-08-05 22:56:14 -07:00

test_fp8.py

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

test_gptq_dynamic.py

[Misc] unify variable for LLM instance (#20996)

2025-07-21 12:18:33 +01:00

test_ipex_quant.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_lm_head.py

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

test_modelopt.py

fix some typos (#24071)

2025-09-02 20:44:50 -07:00

test_ptpc_fp8.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_quark.py

[Rocm] [quantization] Fix quark ptpc moe and add test case (#24649)

2025-09-16 22:15:13 -07:00

test_register_quantization_config.py

[Misc] unify variable for LLM instance (#20996)

2025-07-21 12:18:33 +01:00

test_rtn.py

[Feature] Add support for MoE models in the calibration-free RTN-based quantization (#20766)

2025-07-25 18:09:34 -07:00

test_torchao.py

[torchao] Support quantization configs using module swap (#21982)

2025-09-10 23:53:24 -07:00

utils.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00