xinyun / vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-03-25 05:29:12 +08:00)
vllm / vllm / model_executor (history)
Latest commit: c1ffcb55da by Wentao Ye
[Refactor] Optimize FP8 MOE Backend Choice and Log (#26044)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-10-03 15:23:42 -06:00
layers        | [Refactor] Optimize FP8 MOE Backend Choice and Log (#26044)                                                              | 2025-10-03 15:23:42 -06:00
model_loader  | [Quantization/NVFP4] Speed up TRTLLM NVFP4 MOE weight loading and fix K/V scale loading for MLA Attn (#25968)             | 2025-10-03 19:35:06 +00:00
models        | [BugFix][QWEN-VL] Fix wrong apply_rotary_emb_torch selection introduced by #24642 (#26123)                                | 2025-10-03 08:52:26 -07:00
warmup        | [V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)                                           | 2025-09-25 17:37:50 +00:00
__init__.py   | [V0 Deprecation] Remove V0 sampling metadata (#25345)                                                                     | 2025-09-21 10:37:11 -07:00
custom_op.py  | [V0 deprecation] Deprecate V0 Neuron backend (#21159)                                                                     | 2025-09-06 16:15:18 -07:00
parameter.py  | Revert "[Bug] Dynamo Unsupported due to BasevLLMParameter.torch_function calling disabled super()" (#25681)               | 2025-09-25 09:45:06 -07:00
utils.py      | [OOT] Support sync_model_loading for OOT (#25126)                                                                         | 2025-09-19 05:41:53 +00:00