vllm/moe at 18dd5e01f207e67c5c9999709327accf45b44da6 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-08 18:27:17 +08:00

Jinzhen Lin 1d0c9d6b2d

[Kernel] some optimizations for dense marlin and moe marlin (#16850)

Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>

2025-05-05 09:39:30 -07:00

..

pre-commit autoupdate (#17380)

2025-04-29 06:46:55 -07:00

marlin_moe_wna16

[Kernel] some optimizations for dense marlin and moe marlin (#16850)

2025-05-05 09:39:30 -07:00

permute_unpermute_kernels

permute/unpermute kernel for moe optimization (#14568)

2025-05-02 11:31:55 -07:00

marlin_moe_ops.cu

[Bugfix] Fix support for dimension like integers and ScalarType (#9299)

2024-10-17 19:08:34 +00:00

moe_align_sum_kernels.cu

Optimize moe_align_block_size for deepseek_v3 (#12850)

2025-02-13 18:43:37 -05:00

moe_ops.h

[ROCm][Bugfix] Ensure that the moe_wna16_gemm kernel is not built on ROCm platforms (#14629)

2025-03-12 08:00:28 -04:00

moe_permute_unpermute_op.cu

permute/unpermute kernel for moe optimization (#14568)

2025-05-02 11:31:55 -07:00

moe_wna16_utils.h

pre-commit autoupdate (#17380)

2025-04-29 06:46:55 -07:00

moe_wna16.cu

[BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801)

2025-04-17 22:13:29 -07:00

topk_softmax_kernels.cu

[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)

2024-06-09 16:23:30 -04:00

torch_bindings.cpp

permute/unpermute kernel for moe optimization (#14568)

2025-05-02 11:31:55 -07:00