vllm/quantization at 50fede6634a997f4e971ecb4eb4cce337340e394 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-22 22:47:24 +08:00

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

2025-08-22 10:56:57 +08:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

reference_mxfp4.py

[Feature][Quantization] MXFP4 support for MOE models (#17888)

2025-07-09 13:19:02 -07:00

test_auto_round.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_compressed_tensors.py

[Refactor] Refactor MOE NVFP4 Code Base: ModelOpt + Compressed Tensor (#21631)

2025-07-27 05:25:21 -07:00

test_configs.py

[Kernel/Quant] Remove the original marlin format and qqq (#23204)

2025-08-20 15:13:36 -04:00

test_cpu_offload.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_experts_int8.py

Update transformers to v4.55 (#21931)

2025-08-05 22:56:14 -07:00

test_fp8.py

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

test_gptq_dynamic.py

[Misc] unify variable for LLM instance (#20996)

2025-07-21 12:18:33 +01:00

test_ipex_quant.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_lm_head.py

[Deprecation] Remove prompt_token_ids arg fallback in LLM.generate and LLM.embed (#18800)

2025-08-22 10:56:57 +08:00

test_modelopt.py

Add Nvidia ModelOpt config adaptation (#19815)

2025-07-21 10:02:58 -04:00

test_ptpc_fp8.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_quark.py

[Misc] unify variable for LLM instance (#20996)

2025-07-21 12:18:33 +01:00

test_register_quantization_config.py

[Misc] unify variable for LLM instance (#20996)

2025-07-21 12:18:33 +01:00

test_rtn.py

[Feature] Add support for MoE models in the calibration-free RTN-based quantization (#20766)

2025-07-25 18:09:34 -07:00

test_torchao.py

Fix TorchAOConfig skip layers (#19265)

2025-06-12 22:22:53 +08:00

utils.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00