Lucas Wilkinson baeded2569
[Attention] Deepseek v3 MLA support with FP8 compute (#12601)
This PR implements the Deepseek V3 support by performing matrix absorption the fp8 weights 

---------

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Alexander Matveev <59768536+alexm-neuralmagic@users.noreply.github.com>
2025-01-31 21:52:51 -08:00
..
2025-01-27 17:23:08 -07:00
2025-01-27 17:23:08 -07:00
2025-01-15 02:29:53 +00:00
2025-01-27 17:23:08 -07:00
2025-01-27 17:23:08 -07:00
2025-01-27 17:23:08 -07:00