5 Commits

Author SHA1 Message Date
Zhuohan Li
2f49f15585
Support tensor parallel (#2) 2023-03-21 13:45:42 -07:00
Woosuk Kwon
e9d3f2ff77
Add memory analyzer & automatically configure KV cache size (#6) 2023-03-11 23:23:14 -08:00
Woosuk Kwon
1a7eb7da61
Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
Woosuk Kwon
7b6844e590
Add input metadata 2023-02-22 19:01:20 +00:00
Woosuk Kwon
709a69176e
Move worker/models -> models 2023-02-22 18:03:48 +00:00