560 Commits

Author SHA1 Message Date
Woosuk Kwon
897cb2ae28
Optimize data movement (#20) 2023-04-02 00:30:17 -07:00
Woosuk Kwon
09e9245478
Add custom kernel for RMS normalization (#16) 2023-04-01 00:51:22 +08:00
Woosuk Kwon
88c0268a18
Implement custom kernel for LLaMA rotary embedding (#14) 2023-03-30 11:04:21 -07:00
Woosuk Kwon
cfae35b861
Add miscellaneous updates (#8) 2023-03-13 13:48:38 -07:00
Woosuk Kwon
1a7eb7da61
Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
Woosuk Kwon
0deacbce6e
Implement single_query_cached_kv_attention kernel (#3) 2023-03-01 15:02:19 -08:00
Woosuk Kwon
c413c41cda
Add reshape_and_cache op 2023-02-18 19:22:57 +00:00
Woosuk Kwon
ffad4e1e03
cache_kernel -> cache_kernels 2023-02-16 20:05:45 +00:00
Woosuk Kwon
6d2f74efb3
Remove redundant fn 2023-02-16 09:24:42 +00:00
Woosuk Kwon
6f058c7ba8
Implement cache ops 2023-02-16 07:47:03 +00:00