Cyrus Leung | 9edca6bf8f | [Frontend] Online Pooling API (#11457); Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-12-24 17:54:30 +08:00 |
Cyrus Leung | 8f10d5e393 | [Misc] Split up pooling tasks (#10820); Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-12-11 01:28:00 -08:00 |
Cyrus Leung | 32e46e000f | [Frontend] Automatic detection of chat content format from AST (#9919); Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-11-16 13:35:40 +08:00 |
zifeitong | 47db6ec831 | [Frontend] Add per-request number of cached token stats (#10174) | 2024-11-12 16:42:28 +00:00 |
Aaron Pham | 21063c11c7 | [CI/Build] drop support for Python 3.8 EOL (#8464); Signed-off-by: Aaron Pham <contact@aarnphm.xyz> | 2024-11-06 07:11:55 +00:00 |
Cyrus Leung | 06386a64dd | [Frontend] Chat-based Embeddings API (#9759) | 2024-11-01 08:13:35 +00:00 |
Jiaxin Shan | 260d40b5ea | [Core] Support Lora lineage and base model metadata management (#6315) | 2024-09-20 06:20:56 +00:00 |
youkaichao | f842a7aff1 | [misc] remove engine_use_ray (#8126) | 2024-09-11 18:23:36 -07:00 |
Pooya Davoodi | cea95dfb94 | [Frontend] Create ErrorResponse instead of raising exceptions in run_batch (#8347) | 2024-09-11 05:30:11 +00:00 |
Adam Lugowski | 58fcc8545a | [Frontend] Add progress reporting to run_batch.py (#8060); Co-authored-by: Adam Lugowski <adam.lugowski@parasail.io> | 2024-09-09 11:16:37 -07:00 |
Pooya Davoodi | 8da48e4d95 | [Frontend] Publish Prometheus metrics in run_batch API (#7641) | 2024-08-23 23:04:22 -07:00 |
Pooya Davoodi | 6885fde317 | [Bugfix] Fix run_batch logger (#7640) | 2024-08-23 13:58:26 -07:00 |
Pooya Davoodi | 249b88228d | [Frontend] Support embeddings in the run_batch API (#7132); Co-authored-by: Simon Mo <simon.mo@hey.com> | 2024-08-09 09:48:21 -07:00 |
Cyrus Leung | 739b61a348 | [Frontend] Refactor prompt processing (#4028); Co-authored-by: Roger Wang <ywang@roblox.com> | 2024-07-22 10:13:53 -07:00 |
zifeitong | ff9ddbceee | [Misc] Remove #4789 workaround left in vllm/entrypoints/openai/run_batch.py (#5756) | 2024-06-22 03:33:12 +00:00 |
Michael Goin | 8065a7e220 | [Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (#5718) | 2024-06-20 17:00:13 -06:00 |
zifeitong | 26e1188e51 | [Fix] Use utf-8 encoding in entrypoints/openai/run_batch.py (#5606) | 2024-06-17 23:16:10 +00:00 |
zifeitong | 3ce2c050dd | [Fix] Correct OpenAI batch response format (#5554) | 2024-06-15 16:57:54 -07:00 |
Cyrus Leung | 0e9164b40a | [mypy] Enable type checking for test directory (#5017) | 2024-06-15 04:45:31 +00:00 |
Cyrus Leung | 03dccc886e | [Misc] Add vLLM version getter to utils (#5098) | 2024-06-13 11:21:39 -07:00 |
Alex Wu | 5e0391c040 | [Frontend] Separate OpenAI Batch Runner usage from API Server (#4851) | 2024-05-17 00:42:41 +09:00 |
Alex Wu | 52f8107cf2 | [Frontend] Support OpenAI batch file format (#4794); Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com> | 2024-05-15 19:13:36 -04:00 |