xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-24 17:35:39 +08:00)
vllm/model_executor: History
Latest commit: 081b5594a2, Fix routing_bias dtype (#25711), Shu Wang, 2025-09-25 23:35:14 +00:00
Signed-off-by: Shu Wang. <shuw@nvidia.com>
Name            Last commit                                                                                                    Last commit date
layers          Fix routing_bias dtype (#25711)                                                                                2025-09-25 23:35:14 +00:00
model_loader    [Optimization] Use a cheaper cache key in get_model_architecture (#25682)                                      2025-09-25 17:54:20 -04:00
models          [Model] rename NemotronH_Nano_VL -> NemotronH_Nano_VL_V2 (#25708)                                              2025-09-25 16:10:29 -07:00
warmup          [V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489)                                2025-09-25 17:37:50 +00:00
__init__.py     [V0 Deprecation] Remove V0 sampling metadata (#25345)                                                          2025-09-21 10:37:11 -07:00
custom_op.py    [V0 deprecation] Deprecate V0 Neuron backend (#21159)                                                          2025-09-06 16:15:18 -07:00
parameter.py    Revert "[Bug] Dynamo Unsupported due to BasevLLMParameter.torch_function calling disabled super()" (#25681)   2025-09-25 09:45:06 -07:00
utils.py        [OOT] Support sync_model_loading for OOT (#25126)                                                              2025-09-19 05:41:53 +00:00