971 Commits

Author | SHA1 | Message | Date
Antoni Baum | 9925c17940 | Ray placement group support (#397) | 2023-07-19 22:49:31 -07:00
Massimiliano Pronesti | 16c3e295a8 | fix(ray_utils): ignore re-init error (#465) | 2023-07-19 17:01:19 -07:00
Lily Liu | b4b195b360 | fix max seq len (#489) | 2023-07-17 23:20:20 -07:00
Zhuohan Li | 2bdea7ac11 | [Fix] Fix the condition of max_seq_len (#477) | 2023-07-17 00:33:48 -04:00
Zhangir Azerbayev | 6d7d95a70a | Offload port selection to OS (#467) | 2023-07-15 23:11:02 -07:00
xcnick | c6dfc3cdbe | Fix handling of special tokens in decoding. (#418) | 2023-07-12 11:14:56 -04:00
codethazine | a945fcc2ae | Add trust-remote-code flag to handle remote tokenizers (#364) | 2023-07-07 11:04:58 -07:00
coolcloudcol | 7717d0838b | Fix an endless loop issue when engine_step throws a RuntimeError (#339) | 2023-07-03 15:22:28 -07:00
Zhuohan Li | 42e0c1df78 | [Quality] Add CI for formatting (#343) | 2023-07-03 14:50:56 -07:00
Zhuohan Li | d6fa1be3a8 | [Quality] Add code formatter and linter (#326) | 2023-07-03 11:31:55 -07:00
Lily Liu | dafd924c1f | Raise error for long prompt (#273) | 2023-06-30 18:48:49 -07:00
Woosuk Kwon | 998d9d1509 | [Tokenizer] Add tokenizer mode (#298) | 2023-06-28 14:19:22 -07:00
Woosuk Kwon | 4338cc4750 | [Tokenizer] Add an option to specify tokenizer (#284) | 2023-06-28 09:46:58 -07:00
Zhuohan Li | 0b7db411b5 | [Bug] Fix the OOM condition for CPU cache (#260) | 2023-06-26 11:16:13 -07:00
metacryptom | 0603379863 | fix wrong using getattr to get dict value (#232) | 2023-06-24 22:00:24 -07:00
Zhuohan Li | 1d24ccb96c | [Fix] Better error message when there is OOM during cache initialization (#203) | 2023-06-22 15:30:06 +08:00
Woosuk Kwon | 14f0b39cda | [Bugfix] Fix a bug in RequestOutput.finished (#202) | 2023-06-22 00:17:24 -07:00
Zhuohan Li | 2e0d314384 | fix-ray (#193) | 2023-06-22 00:21:41 +08:00
Woosuk Kwon | 67d96c29fb | Use slow tokenizer for open llama models (#168) | 2023-06-20 14:19:47 +08:00
Zhuohan Li | bf5f121c02 | Reduce GPU memory utilization to make sure OOM doesn't happen (#153) | 2023-06-18 17:33:50 +08:00
Woosuk Kwon | 0b98ba15c7 | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00