Add get_fp8_min_max() helper in quant_utils.py to centralize the
FP8 min/max value logic for ROCm fnuz dtype handling.
On ROCm with torch.float8_e4m3fnuz, using PyTorch's default finfo.max
(240.0) causes accuracy issues with dynamic quantization. The correct
maximum for the fnuz dtype is 224.0.
This change:
- Adds get_fp8_min_max(dtype) helper returning (fp8_min, fp8_max) tuple
- Updates input_quant_fp8.py to use the helper
- Updates fp8_utils.py per_token_group_quant_fp8() to use the helper
- Updates deep_gemm.py per_block_cast_to_fp8() to use the helper
- Updates tests/kernels/quant_utils.py to use the helper
Fixes #30360
Signed-off-by: c0de128 <kevin.mckay@outlook.com>