Zhuohan Li | 96853af5a8 | Optimize MQA Kernel (#452) | 2023-07-14 20:06:40 -04:00
Andre Slavescu | c894836108 | [Model] Add support for GPT-J (#226); Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2023-07-08 17:55:16 -07:00
Woosuk Kwon | 404422f42e | [Model] Add support for MPT (#334) | 2023-07-03 16:47:53 -07:00
Woosuk Kwon | e41f06702c | Add support for BLOOM (#331) | 2023-07-03 13:12:35 -07:00
Woosuk Kwon | 0b98ba15c7 | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00
Woosuk Kwon | e38074b1e6 | Support FP32 (#141) | 2023-06-07 00:40:21 -07:00
Woosuk Kwon | d721168449 | Improve setup script & Add a guard for bfloat16 kernels (#130) | 2023-05-27 00:59:32 -07:00
Woosuk Kwon | 667ba3995c | Add copyright headers to source files adapted from FT (#104) | 2023-05-14 22:19:19 -07:00
Woosuk Kwon | 130d5fd8c7 | Fix a bug in attention kernel (#68) | 2023-05-04 02:56:09 -07:00
Woosuk Kwon | e070829ae8 | Support bfloat16 data type (#54) | 2023-05-03 14:09:44 -07:00
Woosuk Kwon | 436e523bf1 | Refactor attention kernels (#53) | 2023-05-03 13:40:13 -07:00