xinyun / vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-01-16 02:54:30 +08:00)
vllm / vllm / utils
History
Latest commit: 37bd8d6e4c by Wentao Ye, 2025-07-18 23:25:22 -07:00
[Bug] DeepGemm: Fix TypeError: per_block_cast_to_fp8() missing 1 required positional argument: 'use_ue8m0' for SM100 (#21187)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
__init__.py     Support FP8 Quantization and Inference Run on Intel Gaudi (HPU) using INC (Intel Neural Compressor) (#12010)                                              2025-07-16 15:33:41 -04:00
deep_gemm.py    [Bug] DeepGemm: Fix TypeError: per_block_cast_to_fp8() missing 1 required positional argument: 'use_ue8m0' for SM100 (#21187)                            2025-07-18 23:25:22 -07:00
flashinfer.py   [Core] FlashInfer CUTLASS fused MoE backend (NVFP4) (#20037)                                                                                              2025-07-17 21:32:45 -07:00
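
For context on the deep_gemm.py change above: the fixed TypeError comes from call sites invoking per_block_cast_to_fp8() without the newly required use_ue8m0 argument. Below is a minimal, hedged sketch of a per-block FP8 E4M3 cast in which that flag defaults from the detected hardware, so older two-argument callers keep working. The helper name _prefers_ue8m0_scales, the 128-wide block size, the 448.0 E4M3 maximum, and the SM100 capability check are illustrative assumptions, not vLLM's or DeepGEMM's actual implementation.

from typing import Optional

import torch


def _prefers_ue8m0_scales() -> bool:
    # Assumption for illustration: use power-of-two (UE8M0) scales on
    # compute capability 10.x (SM100 / Blackwell) GPUs, plain float scales
    # elsewhere.
    if not torch.cuda.is_available():
        return False
    major, _ = torch.cuda.get_device_capability()
    return major >= 10


def per_block_cast_to_fp8(
    x: torch.Tensor,
    block_size: int = 128,
    use_ue8m0: Optional[bool] = None,
):
    """Quantize a 2-D tensor to FP8 E4M3, one block_size x block_size tile at a time."""
    assert x.dim() == 2, "expected a 2-D tensor"
    if use_ue8m0 is None:
        # Deriving a default for the new flag keeps callers written against the
        # older signature from raising
        # "missing 1 required positional argument: 'use_ue8m0'".
        use_ue8m0 = _prefers_ue8m0_scales()

    m, n = x.shape
    pm = -(-m // block_size) * block_size  # round rows up to a block multiple
    pn = -(-n // block_size) * block_size  # round cols up to a block multiple
    padded = torch.zeros(pm, pn, dtype=torch.float32, device=x.device)
    padded[:m, :n] = x.float()

    # View as (row tiles, block, col tiles, block) and take one amax per tile.
    tiles = padded.view(pm // block_size, block_size, pn // block_size, block_size)
    amax = tiles.abs().amax(dim=(1, 3), keepdim=True).clamp(min=1e-4)
    scales = amax / 448.0  # 448 is the largest normal FP8 E4M3 value
    if use_ue8m0:
        # UE8M0 scales are power-of-two exponents: round each scale up to the
        # next power of two before quantizing.
        scales = torch.exp2(torch.ceil(torch.log2(scales)))

    quantized = (tiles / scales).to(torch.float8_e4m3fn).view(pm, pn)[:m, :n]
    return quantized, scales.squeeze(3).squeeze(1)

Under these assumptions, q, s = per_block_cast_to_fp8(torch.randn(512, 1024)) returns the FP8 data plus one scale per 128 x 128 tile, and passing use_ue8m0 explicitly overrides the hardware-derived default.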