vllm/cpu at 899e2ef558e7345b99bc0d53c2e1c60ffdca7470 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-31 04:34:25 +08:00

[CPU] Update torch 2.9.1 for CPU backend (#29664)

Signed-off-by: jiang1.li <jiang1.li@intel.com>

2025-11-28 13:37:54 +00:00

micro_gemm

[CPU] Refactor CPU WNA16 (#28826)

2025-11-19 10:32:00 +08:00

sgl-kernels

[Doc]: fix typos in various files (#24726)

2025-09-12 06:43:12 -07:00

activation.cpp

…

cpu_attn_amx.hpp

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

cpu_attn_impl.hpp

[CPU][IBM Z] Fix BF16 support and vectorize math operations for s390x (#28926)

2025-11-24 12:08:09 +00:00

cpu_attn_macros.h

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

cpu_attn_neon.hpp

[perf][cpu] Accelerate paged attention GEMMs (QK, PV) on Arm CPUs with NEON (#29193)

2025-11-22 09:04:36 -08:00

cpu_attn_vec16.hpp

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

cpu_attn_vec.hpp

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

cpu_attn.cpp

[perf][cpu] Accelerate paged attention GEMMs (QK, PV) on Arm CPUs with NEON (#29193)

2025-11-22 09:04:36 -08:00

cpu_types_arm.hpp

[Hardware][CPU] Vllm int8 quantization enablement for ARM CPU (#14129)

2025-07-10 15:59:04 +00:00

cpu_types_scalar.hpp

refactor(cpu_types_scalar.hpp): Unify scalar loop implementations using unroll_loop (#28847)

2025-11-19 11:05:44 +00:00

cpu_types_vsx.hpp

[Hardware][Power] Enable compressed tensor W8A8 INT8 quantization for POWER (#17153)

2025-05-07 22:35:03 -07:00

cpu_types_vxe.hpp

[CPU][IBM Z] Fix BF16 support and vectorize math operations for s390x (#28926)

2025-11-24 12:08:09 +00:00

cpu_types_x86.hpp

[CPU] Refactor CPU WNA16 (#28826)

2025-11-19 10:32:00 +08:00

cpu_types.hpp

[Hardware][RISC-V] Add riscv64 support for vLLM with scalar (#22112)

2025-09-25 20:46:11 +08:00

cpu_wna16.cpp

[CPU] Refactor CPU WNA16 (#28826)

2025-11-19 10:32:00 +08:00

dnnl_helper.cpp

[CPU] Refactor CPU WNA16 (#28826)

2025-11-19 10:32:00 +08:00

dnnl_helper.h

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

dnnl_kernels.cpp

[cpu][perf] Accelerate unquantized-linear for AArch64 through oneDNN/ACL and weight prepack (#25948)

2025-10-04 12:16:38 +08:00

float_convert.hpp

[Hardware][RISC-V] Add riscv64 support for vLLM with scalar (#22112)

2025-09-25 20:46:11 +08:00

layernorm.cpp

…

mla_decode.cpp

[Kernel][CPU] CPU MLA (#14744)

2025-03-25 09:34:59 +00:00

pos_encoding.cpp

Make key optional for rotary embedding (#17566)

2025-05-07 00:11:46 -07:00

scratchpad_manager.cpp

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

scratchpad_manager.h

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

shm.cpp

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

torch_bindings.cpp

cleanup at::Tag::needs_fixed_stride_order (#28974)

2025-11-20 02:51:36 -08:00

utils.cpp

[CPU] Update torch 2.9.1 for CPU backend (#29664)

2025-11-28 13:37:54 +00:00

utils.hpp

[CI/Build] Fix broken build on Apple M1 (#28999)

2025-11-19 11:07:22 +00:00