vllm/cpu at ccf02fcbaebb1a5b59dfc6c7cb64aa7cc489f04c - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-31 18:47:56 +08:00

Thien Tran 27b50f1fe6

[Bugfix][Kernel][CPU] Fix num_tokens in CPU rotary embedding kernel (#14667)

Signed-off-by: Thien Tran <gau.nernst@yahoo.com.sg>

2025-03-13 23:47:49 -07:00

activation.cpp

[Kernel][CPU] Add Quick gelu to CPU (#5717)

2024-06-21 06:39:40 +00:00

attention.cpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

cache.cpp

[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906)

2025-01-23 18:04:03 +00:00

cpu_types_arm.hpp

[Bugfix] Explicitly include "omp.h" for MacOS to avoid installation failure (#14051)

2025-03-02 17:35:01 -08:00

cpu_types_vsx.hpp

Move linting to pre-commit (#11975)

2025-01-20 14:58:01 +08:00

cpu_types_vxe.hpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

cpu_types_x86.hpp

Move linting to pre-commit (#11975)

2025-01-20 14:58:01 +08:00

cpu_types.hpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

dnnl_helper.hpp

[Hardware][CPU] Update torch 2.5 (#9911)

2024-11-07 04:43:08 +00:00

layernorm.cpp

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

pos_encoding.cpp

[Bugfix][Kernel][CPU] Fix num_tokens in CPU rotary embedding kernel (#14667)

2025-03-13 23:47:49 -07:00

quant.cpp

Adding cpu inference with VXE ISA for s390x architecture (#12613)

2025-03-06 08:40:53 -08:00

torch_bindings.cpp

[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906)

2025-01-23 18:04:03 +00:00

utils.cpp

[Hardware][Apple] Native support for macOS Apple Silicon (#11696)

2025-01-08 16:35:49 +08:00