Lucas Wilkinson
3e41992fec
[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-12 05:57:47 -08:00
Xin Yang
a491b0911b
[LoRA] Support FusedMoE LoRA Triton kernel for mxfp4 (#29708)
...
Signed-off-by: Xin Yang <xyangx@amazon.com>
Signed-off-by: Xin Yang <105740670+xyang16@users.noreply.github.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-30 10:37:25 +08:00
Huamin Li
3fd1fb0b60
Revert "[LoRA] Support FusedMoE LoRA Triton kernel for mxfp4 (#28971)" (#29697)
...
Signed-off-by: Huamin Li <3ericli@gmail.com>
2025-11-28 15:26:52 -08:00
Xin Yang
745a3bae1a
[LoRA] Support FusedMoE LoRA Triton kernel for mxfp4 (#28971)
...
Signed-off-by: Xin Yang <xyangx@amazon.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-11-28 10:48:28 +08:00