xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2025-12-09 16:24:56 +08:00)
vllm/csrc/cpu
Latest commit: a6f332d0d9 by Li, Jiang (2024-11-07 18:42:50 +08:00)
[Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target (#10108)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
File                 | Last commit                                                                        | Date
activation.cpp       | [Kernel][CPU] Add Quick gelu to CPU (#5717)                                        | 2024-06-21 06:39:40 +00:00
attention.cpp        | [Hardware][CPU] Update torch 2.5 (#9911)                                           | 2024-11-07 04:43:08 +00:00
cache.cpp            | [Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081)   | 2024-07-16 15:31:32 -07:00
cpu_types_vsx.hpp    | Support CPU inference with VSX PowerPC ISA (#5652)                                 | 2024-06-26 21:53:04 +00:00
cpu_types_x86.hpp    | [Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target (#10108)        | 2024-11-07 18:42:50 +08:00
cpu_types.hpp        | Support CPU inference with VSX PowerPC ISA (#5652)                                 | 2024-06-26 21:53:04 +00:00
dnnl_helper.hpp      | [Hardware][CPU] Update torch 2.5 (#9911)                                           | 2024-11-07 04:43:08 +00:00
layernorm.cpp        | [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047) | 2024-06-09 16:23:30 -04:00
pos_encoding.cpp     | [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047) | 2024-06-09 16:23:30 -04:00
quant.cpp            | [Hardware][CPU] Update torch 2.5 (#9911)                                           | 2024-11-07 04:43:08 +00:00
torch_bindings.cpp   | [Hardware][CPU] compressed-tensor INT8 W8A8 AZP support (#9344)                    | 2024-10-17 12:21:04 -04:00
utils.cpp            | [Hardware][Intel] Support compressed-tensor W8A8 for CPU backend (#7257)           | 2024-09-11 09:46:46 -07:00