| Nick Hill | 1325872ec8 | [Frontend] Avoid creating guided decoding LogitsProcessor unnecessarily (#9521) | 2024-10-18 20:21:01 -07:00 |
| youkaichao | cbc2ef5529 | [misc] hide best_of from engine (#9261); Co-authored-by: Brendan Wong <bjwpokemon@gmail.com> | 2024-10-10 21:30:44 -07:00 |
| Travis Johnson | 480b7f40cf | [Misc] Improve validation errors around best_of and n (#9167); Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> | 2024-10-09 04:54:48 +00:00 |
| youkaichao | 18b296fdb2 | [core] remove beam search from the core (#9105) | 2024-10-07 05:47:04 +00:00 |
| Brendan Wong | 168cab6bbf | [Frontend] API support for beam search (#9087); Co-authored-by: youkaichao <youkaichao@126.com> | 2024-10-05 23:39:03 -07:00 |
| Joe Runde | 062c89e7c9 | [Frontend][Core] Move guided decoding params into sampling params (#8252); Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>; Co-authored-by: Nick Hill <nickhill@us.ibm.com> | 2024-10-01 09:34:25 +08:00 |
| youkaichao | 1e7d5c01f5 | [misc] soft drop beam search (#8763) | 2024-09-24 15:48:39 -07:00 |
| saumya-saran | b28298f2f4 | [Bugfix] Validate SamplingParam n is an int (#8548) | 2024-09-20 12:46:02 -07:00 |
| Nick Hill | 551ce01078 | [Core] Add engine option to return only deltas or final output (#7381) | 2024-09-12 12:02:00 -07:00 |
| Cyrus Leung | baaedfdb2d | [mypy] Enable following imports for entrypoints (#7248); Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>; Co-authored-by: Fei <dfdfcai4@gmail.com> | 2024-08-20 23:28:21 -07:00 |
| SangBin Cho | ff7ec82c4d | [Core] Optimize SPMD architecture with delta + serialization optimization (#7109) | 2024-08-18 17:57:20 -07:00 |
| Chang Su | c134a46402 | Fix empty output when temp is too low (#2937); Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> | 2024-08-14 05:31:44 +00:00 |
| Atilla Akkuş | 7b261092de | [BUGFIX]: top_k is expected to be an integer. (#7227) | 2024-08-07 00:32:16 -07:00 |
| Peng Guanwen | db9e5708a9 | [Core] Reduce unnecessary compute when logprobs=None (#6532) | 2024-07-29 16:47:31 +00:00 |
| Woosuk Kwon | bdf5fd1386 | [Misc] Remove deprecation warning for beam search (#6659) | 2024-07-23 00:21:58 +00:00 |
| Simon Mo | 32c9d7f765 | Report usage for beam search (#6404) | 2024-07-14 19:37:35 -07:00 |
| Woosuk Kwon | eeceadaecc | [Misc] Add deprecation warning for beam search (#6402) | 2024-07-13 11:52:22 -07:00 |
| Nick Hill | 365791ff81 | [BugFix] Fix min_tokens behaviour for multiple eos tokens (#5849) | 2024-06-27 11:31:11 -07:00 |
| Elisei Smirnov | e3470f8753 | [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985); Co-authored-by: Elisei Smirnov <el.smirnov@innopolis.university> | 2024-05-23 22:04:24 +00:00 |
| sasha0552 | 69909126a7 | [Bugfix] Use random seed if seed is -1 (#4531) | 2024-05-01 10:41:17 -07:00 |
| Li, Jiang | dd1a50a8bc | [Bugfix][Minor] Make ignore_eos effective (#4468) | 2024-04-30 16:33:33 -07:00 |
| Nick Hill | 81661da7b2 | [BugFix] Fix min_tokens when eos_token_id is None (#4389); Co-authored-by: DefTruth <31974251+deftruth@users.noreply.github.com> | 2024-04-27 09:52:46 -07:00 |
| Simon Mo | a134ef6f5e | Support eos_token_id from generation_config.json (#4182) | 2024-04-19 04:13:36 +00:00 |
| SangBin Cho | 09473ee41c | [mypy] Add mypy type annotation part 1 (#4006) | 2024-04-12 14:35:50 -07:00 |
| Nick Hill | e46a60aa4c | [BugFix] Fix handling of stop strings and stop token ids (#3672) | 2024-04-11 15:34:12 -07:00 |
| Thomas Parnell | 1d7c940d74 | Add option to completion API to truncate prompt tokens (#3144) | 2024-04-05 10:15:42 -07:00 |
| Matthias Gerstgrasser | aabe8f40f2 | [Core] [Frontend] Make detokenization optional (#3749); Co-authored-by: Nick Hill <nickhill@us.ibm.com> | 2024-04-03 21:52:18 -07:00 |
| Travis Johnson | c13ad1b7bd | feat: implement the min_tokens sampling parameter (#3124); Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>; Co-authored-by: Nick Hill <nickhill@us.ibm.com> | 2024-03-25 10:14:26 -07:00 |
| Zhuohan Li | 2f8844ba08 | Re-enable the 80 char line width limit (#3305) | 2024-03-10 19:49:14 -07:00 |
| Nick Hill | 29a8d6a554 | [Fix] Don't deep-copy LogitsProcessors when copying SamplingParams (#3099) | 2024-02-29 19:20:42 +00:00 |
| Nick Hill | 7d2dcce175 | Support per-request seed (#2514) | 2024-02-21 11:47:00 -08:00 |
| Nikola Borisov | 3209b49033 | [Bugfix] fix crash if max_tokens=None (#2570) | 2024-01-23 22:38:55 -08:00 |
| Roy | 9140561059 | [Minor] Fix typo and remove unused code (#2305) | 2024-01-02 19:23:15 -08:00 |
| Yunfeng Bai | c06170cc8e | Add a flag to include stop string in output text (#1976) | 2023-12-15 00:45:58 -08:00 |
| Roy | 60dc62dc9e | add custom server params (#1868) | 2023-12-03 12:59:18 -08:00 |
| Jerry | f86bd6190a | Fix the typo in SamplingParams' docstring (#1886) | 2023-12-01 02:06:36 -08:00 |
| ljss | de23687d16 | Fix repetition penalty aligned with huggingface (#1577) | 2023-11-22 14:41:44 -08:00 |
| ljss | 4cea74c73b | Set top_p=0 and top_k=-1 in greedy sampling (#1748) | 2023-11-22 12:51:09 -08:00 |
| 陈序 | 094f716bf2 | Add stop_token_ids in SamplingParams.__repr__ (#1745) | 2023-11-21 20:13:53 -08:00 |
| Roy | e87557b069 | Support Min P Sampler (#1642) | 2023-11-17 16:20:49 -08:00 |
| Noam Gat | 555bdcc5a3 | Added logits processor API to sampling params (#1469) | 2023-11-03 14:12:15 -07:00 |
| Dan Lord | 7013a80170 | Add support for spaces_between_special_tokens | 2023-10-30 16:52:56 -07:00 |
| ljss | 69be658bba | Support repetition_penalty (#1424) | 2023-10-29 10:02:41 -07:00 |
| Zhuohan Li | 9d9072a069 | Implement prompt logprobs & Batched topk for computing logprobs (#1328); Co-authored-by: Yunmo Chen <16273544+wanmok@users.noreply.github.com> | 2023-10-16 10:56:50 -07:00 |
| Woosuk Kwon | 84e4e37d14 | [Minor] Fix type annotations (#1238) | 2023-10-02 15:28:31 -07:00 |
| Dan Lord | 20f7cc4cde | Add skip_special_tokens sampling params (#1186) | 2023-09-27 19:21:42 -07:00 |
| Zhuohan Li | 947b794146 | [Sampler] Vectorized sampling (simplified) (#1048); Co-authored-by: Antoni Baum <antoni.baum@protonmail.com> | 2023-09-22 17:48:04 -07:00 |
| Ricardo Lu | f98b745a81 | feat: support stop_token_ids parameter. (#1097) | 2023-09-21 15:34:02 -07:00 |
| Zhuohan Li | 002800f081 | Align vLLM's beam search implementation with HF generate (#857) | 2023-09-04 17:29:42 -07:00 |
| wangcx18 | 0c04ce3234 | Fix typo in sampling_params.py (#788) | 2023-08-18 10:12:46 +09:00 |