vllm/moe at 86debab54c046232014b108d530a8c25d857e9a3 - vllm - 丝路新云 Code Repository

xinyun/vllm

Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-07-28 12:34:27 +08:00)

Richard Barnes 86debab54c

Fix numel() downcast in vllm/csrc/moe/moe_align_sum_kernels.cu +2 (#17082)

Co-authored-by: mgoin <mgoin64@gmail.com>

2025-07-01 06:48:10 +00:00

marlin_moe_wna16

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

permute_unpermute_kernels

[Refactor] Remove unused variables in moe_permute_unpermute_kernel.inl (#19573)

2025-06-13 06:12:15 -07:00

moe_align_sum_kernels.cu

Fix numel() downcast in vllm/csrc/moe/moe_align_sum_kernels.cu +2 (#17082)

2025-07-01 06:48:10 +00:00

moe_ops.h

[Perf] Optimize moe_align_block_size CUDA kernel (#19572)

2025-06-17 11:49:26 -07:00

moe_permute_unpermute_op.cu

[CI] change spell checker from codespell to typos (#18711)

2025-06-11 19:57:10 -07:00

moe_wna16_utils.h

pre-commit autoupdate (#17380)

2025-04-29 06:46:55 -07:00

moe_wna16.cu

[BugFix] Accuracy fix for llama4 int4 - improperly casted scales (#16801)

2025-04-17 22:13:29 -07:00

topk_softmax_kernels.cu

Fix numel() downcast in vllm/csrc/moe/moe_align_sum_kernels.cu +2 (#17082)

2025-07-01 06:48:10 +00:00

torch_bindings.cpp

[Perf] Optimize moe_align_block_size CUDA kernel (#19572)

2025-06-17 11:49:26 -07:00