62 Commits

Author SHA1 Message Date
Grant Pinkert
f878c8feb0
[Feature]: Add OpenAI server prompt_logprobs support #6508 (#7453) 2024-08-16 02:38:08 +00:00
Pooya Davoodi
249b88228d
[Frontend] Support embeddings in the run_batch API (#7132)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-08-09 09:48:21 -07:00
Cyrus Leung
66d617e343
[Frontend] Gracefully handle missing chat template and fix CI failure (#7238)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-08-07 09:12:05 +00:00
Yihuan Bu
654bc5ca49
Support for guided decoding for offline LLM (#6878)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-08-04 03:12:09 +00:00
zifeitong
3c10591ef2
[Bugfix] Set SamplingParams.max_tokens for OpenAI requests if not provided by user (#6954) 2024-07-31 21:13:34 -07:00
Nick Hill
9f69d8245a
[Frontend] New allowed_token_ids decoding request parameter (#6753) 2024-07-29 23:37:27 +00:00
Cyrus Leung
739b61a348
[Frontend] Refactor prompt processing (#4028)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-07-22 10:13:53 -07:00
Cyrus Leung
d7f4178dd9
[Frontend] Move chat utils (#6602)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-07-21 08:38:17 +08:00
sasha0552
7a3d2a5b95
[Frontend] Support for chat completions input in the tokenize endpoint (#5923) 2024-07-16 20:18:09 +08:00
zifeitong
b47008b4d2
[BugFix] BatchResponseData body should be optional (#6345)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-07-15 04:06:09 +00:00
jvlunteren
f1e15da6fe
[Frontend] Continuous usage stats in OpenAI completion API (#5742) 2024-07-05 10:37:09 -07:00
danieljannai21
2c37540aa6
[Frontend] Add template related params to request (#5709) 2024-07-01 23:01:57 -07:00
llmpros
c6c240aa0a
[Frontend]: Support base64 embedding (#5935)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-06-30 23:53:00 +08:00
Robert Shaw
6a62cb82cc
[Bugfix] Fix Engine Failing After Invalid Request - AsyncEngineDeadError (#5963)
Co-authored-by: Robert Shaw <rshaw@neuralmagic>
2024-06-28 17:46:30 -04:00
sasha0552
c54269d967
[Frontend] Add tokenize/detokenize endpoints (#5054) 2024-06-26 16:54:22 +00:00
zifeitong
3ce2c050dd
[Fix] Correct OpenAI batch response format (#5554) 2024-06-15 16:57:54 -07:00
Cyrus Leung
640052b069
[Bugfix][Frontend] Cleanup "fix chat logprobs" (#5026) 2024-06-10 22:36:46 -07:00
maor-ps
351d5e7b82
[Bugfix] OpenAI entrypoint limits logprobs while ignoring server defined --max-logprobs (#5312)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-06-11 10:30:31 +08:00
Itay Etelis
774d1035e4
[Feature][Frontend]: Continued stream_options implementation also in CompletionRequest (#5319) 2024-06-10 14:22:09 +00:00
Itay Etelis
baa15a9ec3
[Feature][Frontend]: Add support for stream_options in ChatCompletionRequest (#5135) 2024-06-07 03:29:24 +00:00
tomeras91
f0a500545f
[Frontend] OpenAI API server: Add add_special_tokens to ChatCompletionRequest (default False) (#5278) 2024-06-05 09:32:58 -07:00
Breno Faria
f775a07e30
[FRONTEND] OpenAI tools support named functions (#5032) 2024-06-03 18:25:29 -05:00
Avinash Raj
f790ad3c50
[Frontend][OpenAI] Support for returning max_model_len on /v1/models response (#4643) 2024-06-02 08:06:13 +00:00
Breno Faria
87d41c849d
[BUGFIX] [FRONTEND] Correct chat logprobs (#5029)
Co-authored-by: Breno Faria <breno.faria@intrafind.com>
2024-05-30 02:52:14 -07:00
Itay Etelis
7c3604fb68
[Bugfix] logprobs is not compatible with the OpenAI spec #4795 (#5031) 2024-05-29 16:13:22 -07:00
Alex Wu
52f8107cf2
[Frontend] Support OpenAI batch file format (#4794)
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
2024-05-15 19:13:36 -04:00
Cyrus Leung
fc0d9dfc3a
[Frontend] Re-enable custom roles in Chat Completions API (#4758) 2024-05-15 14:58:46 -07:00
Chang Su
e254497b66
[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) 2024-05-11 11:30:37 -07:00
sasha0552
c47ba4aaa9
[Bugfix] Add validation for seed (#4529) 2024-05-01 19:31:22 +00:00
Robert Caulk
c3845d82dc
Allow user to define whitespace pattern for outlines (#4305) 2024-04-30 20:48:39 -07:00
Cyrus Leung
8947bc3c15
[Frontend][Bugfix] Disallow extra fields in OpenAI API (#4355) 2024-04-27 05:08:24 +00:00
SangBin Cho
0ae11f78ab
[Mypy] Part 3 fix typing for nested directories for most of directory (#4161) 2024-04-22 21:32:44 -07:00
Noam Gat
05434764cd
LM Format Enforcer Guided Decoding Support (#3868)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-04-16 05:54:57 +00:00
Thomas Parnell
1d7c940d74
Add option to completion API to truncate prompt tokens (#3144) 2024-04-05 10:15:42 -07:00
Roy
f510395bbf
[BugFix][Frontend] Fix completion logprobs=0 error (#3731) 2024-03-29 09:38:21 -07:00
Nick Hill
dfeb2ecc3a
[Misc] Include matched stop string/token in responses (#2976)
Co-authored-by: Sahil Suneja <sahilsuneja@gmail.com>
2024-03-25 17:31:32 -07:00
Travis Johnson
c13ad1b7bd
feat: implement the min_tokens sampling parameter (#3124)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
2024-03-25 10:14:26 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
Simon Mo
ef65dcfa6f
[Doc] Add docs about OpenAI compatible server (#3288) 2024-03-18 22:05:34 -07:00
Simon Mo
120157fd2a
Support arbitrary json_object in OpenAI and Context Free Grammar (#3211) 2024-03-16 13:35:27 -07:00
felixzhu555
703e42ee4b
Add guided decoding for OpenAI API server (#2819)
Co-authored-by: br3no <breno@veltefaria.de>
Co-authored-by: simon-mo <simon.mo@hey.com>
2024-02-29 22:13:08 +00:00
Jae-Won Chung
a6d471c759
Fix: AttributeError in OpenAI-compatible server (#3018) 2024-02-28 22:04:07 -08:00
Dylan Hawk
e0ade06d63
Support logit bias for OpenAI API (#3027) 2024-02-27 11:51:53 +08:00
Jared Moore
70f3e8e3a1
Add LogProbs for Chat Completions in OpenAI (#2918) 2024-02-26 10:39:34 +08:00
Mustafa Eyceoz
5574081c49
Added early stopping to completion APIs (#2939) 2024-02-21 18:24:01 -08:00
Nick Hill
7d2dcce175
Support per-request seed (#2514) 2024-02-21 11:47:00 -08:00
Federico Galatolo
f1f6cc10c7
Added include_stop_str_in_output and length_penalty parameters to OpenAI API (#2562) 2024-01-24 10:21:56 -08:00
Jannis Schönleber
71d63ed72e
migrate pydantic from v1 to v2 (#2531) 2024-01-21 16:05:56 -08:00
Simon Mo
dd7e8f5f64
refactor complemention api for readability (#2499) 2024-01-18 16:45:14 -08:00
Roy
60dc62dc9e
add custom server params (#1868) 2023-12-03 12:59:18 -08:00