xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-11 19:17:08 +08:00

Author	SHA1	Message	Date
elvischenv	dbeee3844c	[Perf] Use NVIDIA hardware-accelerated instruction for float to fp8_e4m3 quantization (#24757) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-09-13 00:16:24 -07:00
Tyler Michael Smith	6e4852ce28	[CI/Build] Suppress divide-by-zero and missing return statement warnings (#7001)	2024-08-05 16:00:01 -04:00
Tyler Michael Smith	cbbc904470	[Kernel] Squash a few more warnings (#6914)	2024-07-30 13:50:42 -04:00
Michael Goin	5f6d10c14c	[CI/Build] Enforce style for C++ and CUDA code with `clang-format` (#4722)	2024-05-22 07:18:41 +00:00
Cody Yu	c833101740	[Kernel] Refactor FP8 kv-cache with NVIDIA float8_e4m3 support (#4535)	2024-05-09 18:04:17 -06:00