vllm/model_executor at 5a87d8b9b1f357a65a9b73773178ae17fd7cd9c8 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-02 23:12:03 +08:00

[Bugfix] Fix grouped_topk pytorch impl when num_experts can't be grouped properly (#29439)

Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>

2025-12-10 19:47:18 -08:00

layers

[Bugfix] Fix grouped_topk pytorch impl when num_experts can't be grouped properly (#29439)

2025-12-10 19:47:18 -08:00

model_loader

[Model][Quantization] Restore MoE + GGUF models support (incl. Qwen3 MoE) by allowing Sideload Parameters (#30116)

2025-12-09 05:30:05 +00:00

models

[Bugfix] Fix HunyuanOCR cross-image contamination in batch processing (#30344)

2025-12-10 18:09:31 +00:00

warmup

[BugFix] Fix AttributeError: 'MergedColumnParallelLinear' object has no attribute 'weight_scale' (#30399)

2025-12-10 07:59:23 -08:00

__init__.py

…

custom_op.py

…

parameter.py

…

utils.py

[Quantization] FP8 Weight Reloading for Quantized RL Rollout (#28480)

2025-12-09 13:54:32 -08:00