1762 Commits

Author SHA1 Message Date
Zhuohan Li
96853af5a8
Optimize MQA Kernel (#452) 2023-07-14 20:06:40 -04:00
Wen Sun
dbed69058c
Fix the KeyError when loading bloom-based models (#441) 2023-07-13 21:58:09 -07:00
panda
7b6ae94059
add vocab padding for LLama(Support WizardLM) (#411) 2023-07-13 23:56:22 -04:00
Andre Slavescu
c894836108
[Model] Add support for GPT-J (#226)
Co-authored-by: woWoosuk Kwon <woosuk.kwon@berkeley.edu>
2023-07-08 17:55:16 -07:00
Woosuk Kwon
404422f42e
[Model] Add support for MPT (#334) 2023-07-03 16:47:53 -07:00
Zhuohan Li
42e0c1df78
[Quality] Add CI for formatting (#343) 2023-07-03 14:50:56 -07:00
Woosuk Kwon
e41f06702c
Add support for BLOOM (#331) 2023-07-03 13:12:35 -07:00
Zhuohan Li
d6fa1be3a8
[Quality] Add code formatter and linter (#326) 2023-07-03 11:31:55 -07:00
Zhuohan Li
598dc4b79a
[Fix] Weight loading for GPTBigCode (#313) 2023-06-29 22:14:17 -07:00
twaka
4026a049d3
expand coverage of gpt2 model loading (#271) 2023-06-27 06:27:41 -07:00
Michael Feil
298695b766
GPTBigCode (StarCoder, SantaCoder Support) (#209) 2023-06-23 01:49:27 +08:00
Woosuk Kwon
0b98ba15c7
Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00