4 Commits

Author SHA1 Message Date
Wentao Ye
ce9dc02c93
[Refactor] Remove unused variables in moe_permute_unpermute_kernel.inl (#19573)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-06-13 06:12:15 -07:00
Chiyue Wei
61059bee40
[Hardware][NVIDIA] FP4 MoE kernel optimization (#19110)
Signed-off-by: Chiyue Wei <chiyuew@nvidia.com>
Co-authored-by: Chiyue Wei <chiyuew@nvidia.com>
2025-06-05 09:48:26 -07:00
Tyler Michael Smith
6e588da0f4
[Build/CI] Fix CUDA 11.8 build (#17679)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-22 12:13:54 -07:00
Caleb_Du
3e887d2e0c
permute/unpermute kernel for moe optimization (#14568)
Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn>
2025-05-02 11:31:55 -07:00