9 Commits

Author SHA1 Message Date
Woosuk Kwon
e3e79e9e8a
Implement AWQ quantization support for LLaMA (#1032)
Co-authored-by: Robert Irvine <robert@seamlessml.com>
Co-authored-by: root <rirv938@gmail.com>
Co-authored-by: Casper <casperbh.96@gmail.com>
Co-authored-by: julian-q <julianhquevedo@gmail.com>
2023-09-16 00:03:37 -07:00
Zhuohan Li
2cf1a333b6
[Doc] Documentation for distributed inference (#261) 2023-06-26 11:34:23 -07:00
Zhuohan Li
a255885f83
Add logo and polish readme (#156) 2023-06-19 16:31:13 +08:00
Woosuk Kwon
376725ce74
[PyPI] Packaging for PyPI distribution (#140) 2023-06-05 20:03:14 -07:00
Woosuk Kwon
19d2899439
Add initial sphinx docs (#120) 2023-05-22 17:02:44 -07:00
Zhuohan Li
4858f3bb45
Add an option to launch cacheflow without ray (#51) 2023-04-30 15:42:17 +08:00
Woosuk Kwon
84eee24e20
Collect system stats in scheduler & Add scripts for experiments (#30) 2023-04-12 15:03:49 -07:00
Woosuk Kwon
3b41f16596
Add gitignore 2023-02-16 07:47:21 +00:00
Woosuk Kwon
0a11a2e5ca
Add gitignore 2023-02-09 11:28:12 +00:00