20 Commits

Author SHA1 Message Date
Noam Gat
555bdcc5a3
Added logits processor API to sampling params (#1469) 2023-11-03 14:12:15 -07:00
Antoni Baum
15f5632365
Delay GPU->CPU sync in sampling (#1337) 2023-10-30 09:01:34 -07:00
ljss
69be658bba
Support repetition_penalty (#1424) 2023-10-29 10:02:41 -07:00
Woosuk Kwon
c1376e0f82
Change scheduler & input tensor shape (#1381) 2023-10-16 17:48:42 -07:00
Zhuohan Li
9d9072a069
Implement prompt logprobs & Batched topk for computing logprobs (#1328)
Co-authored-by: Yunmo Chen <16273544+wanmok@users.noreply.github.com>
2023-10-16 10:56:50 -07:00
yhlskt23
91fce82c6f
change the timing of sorting logits (#1309) 2023-10-10 19:37:42 -07:00
Zhuohan Li
ba0bfd40e2
TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00
Woosuk Kwon
84e4e37d14
[Minor] Fix type annotations (#1238) 2023-10-02 15:28:31 -07:00
Zhuohan Li
f187877945
[FIX] Simplify sampler logic (#1156) 2023-09-23 17:21:56 -07:00
Zhuohan Li
947b794146
[Sampler] Vectorized sampling (simplified) (#1048)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2023-09-22 17:48:04 -07:00
Zhuohan Li
f04908cae7
[FIX] Minor bug fixes (#1035)
* Address review comments
2023-09-13 16:38:12 -07:00
Zhuohan Li
002800f081
Align vLLM's beam search implementation with HF generate (#857) 2023-09-04 17:29:42 -07:00
Dong-Yong Lee
e11222333f
fix: bug fix when penalties are negative (#913)
Co-authored-by: dongyong-lee <dongyong.lee@navercorp.com>
2023-09-01 00:37:17 +09:00
Aman Gupta Karmani
28873a2799
Improve _prune_hidden_states micro-benchmark (#707) 2023-08-31 13:28:43 +09:00
Woosuk Kwon
94d2f59895
Set replacement=True in torch.multinomial (#858) 2023-08-25 12:22:01 +09:00
Abraham-Xu
d1744376ae
Align with huggingface Top K sampling (#753) 2023-08-15 16:44:33 -07:00
Andre Slavescu
c894836108
[Model] Add support for GPT-J (#226)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2023-07-08 17:55:16 -07:00
Zhuohan Li
d6fa1be3a8
[Quality] Add code formatter and linter (#326) 2023-07-03 11:31:55 -07:00
Lily Liu
425040d4c1
remove floats == 0 comparison (#285) 2023-06-28 14:11:51 -07:00
Woosuk Kwon
0b98ba15c7
Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00