Caleb_Du
57c22e57f9
Fix CUDA permute/unpermute for use with DeepGemm Moe (#17934)
Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn>
2025-07-27 07:08:00 -07:00
Wentao Ye
ce9dc02c93
[Refactor] Remove unused variables in moe_permute_unpermute_kernel.inl (#19573)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-06-13 06:12:15 -07:00
Chiyue Wei
61059bee40
[Hardware][NVIDIA] FP4 MoE kernel optimization (#19110)
Signed-off-by: Chiyue Wei <chiyuew@nvidia.com>
Co-authored-by: Chiyue Wei <chiyuew@nvidia.com>
2025-06-05 09:48:26 -07:00
Tyler Michael Smith
6e588da0f4
[Build/CI] Fix CUDA 11.8 build (#17679)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-22 12:13:54 -07:00
Caleb_Du
3e887d2e0c
permute/unpermute kernel for moe optimization (#14568)
Signed-off-by: Caleb_Du <Caleb_Du@zju.edu.cn>
2025-05-02 11:31:55 -07:00