3 Commits

Author SHA1 Message Date
Jee Jee Li
7ecee34321
[Kernel][RFC] Refactor the punica kernel based on Triton (#5036)
2024-07-31 17:12:24 -07:00
Thomas Parnell
9a7e2d0534
[Bugfix] Allow vllm to still work if triton is not installed. (#6786)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2024-07-29 14:51:27 -07:00
Thomas Parnell
eaec4b9153
[Bugfix] Add custom Triton cache manager to resolve MoE MP issue (#6140)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Chih-Chieh-Yang <chih.chieh.yang@ibm.com>
2024-07-15 10:12:47 -07:00