vllm/model_executor at 6160ba4151084c78164a0f472ce4da04067f9705 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-26 14:07:16 +08:00

Duncan Moss 6160ba4151

feat: BF16 FlashInfer Fused Cutlass MOE for Hopper and Blackwell Expert Parallel (#25503)

Signed-off-by: Duncan Moss <djm.moss@gmail.com>

2025-09-24 18:50:04 -04:00

feat: BF16 FlashInfer Fused Cutlass MOE for Hopper and Blackwell Expert Parallel (#25503)

2025-09-24 18:50:04 -04:00

[Docs] Enable fail_on_warning for the docs build in CI (#25580)

2025-09-24 19:30:33 +00:00

[Docs] Enable fail_on_warning for the docs build in CI (#25580)

2025-09-24 19:30:33 +00:00

[Bug] Fix AttributeError: 'FusedMoE' object has no attribute 'w13_weight_scale'. Did you mean: 'w13_weight_scale_inv' (#25519)

2025-09-24 00:07:51 +00:00

__init__.py

[V0 Deprecation] Remove V0 sampling metadata (#25345)

2025-09-21 10:37:11 -07:00

custom_op.py

[V0 deprecation] Deprecate V0 Neuron backend (#21159)

2025-09-06 16:15:18 -07:00

parameter.py

[Core] Support weight_loader_v2 for UnquantizedLinearMethod (#23036)

2025-09-23 18:30:26 -06:00

utils.py

[OOT] Support sync_model_loading for OOT (#25126)

2025-09-19 05:41:53 +00:00