Ce Gao
32b14baf8a
[Refactor][Frontend] Keep all logic about reasoning into one class ( #14428 )
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
2025-03-28 00:23:30 -07:00
Yuan Tang
66aa4c0bf4
[Feature] Add middleware to log API Server responses ( #15593 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-03-27 17:49:38 +00:00
wwl2755
99f536f830
[Misc] Enhance warning information to user-defined chat template ( #15408 )
...
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
2025-03-26 02:21:15 -07:00
Chauncey
ae65f3e237
[Misc]fixed disable these http request logs ( #14754 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-03-19 21:53:40 -07:00
maobaolong
26dd972adb
[FEAT]Support reset prefix cache by specified device ( #15003 )
2025-03-19 10:54:41 -07:00
Simon Mo
3b457143d2
[Bugfix] Register serializers for V0 MQ Engine ( #15009 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-03-18 09:14:47 -04:00
Jun Duan
74bc397b0a
[Core] Expose API endpoint /is_sleeping ( #14312 )
...
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
2025-03-15 06:28:14 -07:00
Robert Shaw
d4d93db2c5
[V1] V1 Enablement Oracle ( #13726 )
...
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-03-14 22:02:20 -07:00
daniel-salib
73deea2fdb
[Frontend] track server_load ( #13950 )
2025-03-14 09:53:17 -07:00
Russell Bryant
0b0d6421b2
[Frontend] Fix log message to use http vs https ( #14774 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-14 09:21:09 -07:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
Jee Jee Li
5157338ed9
[Misc] Improve LoRA spelling ( #13831 )
2025-02-25 23:43:01 -08:00
cjackal
51010a1807
[Misc] set single whitespace between log sentences ( #13771 )
...
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2025-02-25 10:26:12 +08:00
Keyun Tong
8db1b9d0a1
Support SSL Key Rotation in HTTP Server ( #13495 )
2025-02-22 05:17:44 -08:00
Gabriel Marinho
1c3c975766
[FEATURE] Enables /score endpoint for embedding models ( #12846 )
2025-02-20 22:09:47 -08:00
youkaichao
ba81163997
[core] add sleep and wake up endpoint and v1 support ( #12987 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: cennn <2523403608@qq.com>
Co-authored-by: cennn <2523403608@qq.com>
2025-02-20 12:41:17 +08:00
zifeitong
d3231cb436
[Bugfix] Handle content type with optional parameters ( #13383 )
...
Signed-off-by: Zifei Tong <zifeitong@gmail.com>
2025-02-18 11:29:13 +00:00
Yuan Tang
a1074b3efe
[Bugfix] Only print out chat template when supplied ( #13444 )
2025-02-17 21:43:31 -08:00
Nicolò Lucchesi
d84cef76eb
[Frontend] Add /v1/audio/transcriptions OpenAI API endpoint ( #12909 )
2025-02-13 07:23:45 -08:00
Vaibhav Jain
37dfa60037
[Bugfix] Missing Content Type returns 500 Internal Server Error ( #13193 )
2025-02-13 06:52:22 -08:00
Russell Bryant
578087e56c
[Frontend] Pass pre-created socket to uvicorn ( #13113 )
2025-02-13 00:51:46 -08:00
Russell Bryant
d46d490c27
[Frontend] Move CLI code into vllm.cmd package ( #12971 )
2025-02-12 23:12:21 -08:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com>
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com>
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Ce Gao
a7e3eba66f
[Frontend] Support reasoning content for deepseek r1 ( #12473 )
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
2025-01-29 11:38:08 +08:00
Michael Goin
0f657bdc52
Replace missed warning_once for rerank API ( #12472 )
...
Signed-off-by: mgoin <michael@neuralmagic.com>
2025-01-28 19:06:32 +00:00
Kyle Mistele
0034b09ceb
[Frontend] Rerank API (Jina- and Cohere-compatible API) ( #12376 )
...
Signed-off-by: Kyle Mistele <kyle@mistele.com>
2025-01-26 19:58:45 -07:00
Nick Hill
aea94362c9
[Frontend][V1] Online serving performance improvements ( #12287 )
2025-01-22 22:22:12 +00:00
Cody Yu
7206ce4ce1
[Core] Support reset_prefix_cache ( #12284 )
2025-01-22 18:52:27 +00:00
Fred Reiss
c9f09a4fe8
[mypy] Fix mypy warnings in api_server.py ( #11941 )
...
Signed-off-by: Fred Reiss <frreiss@us.ibm.com>
2025-01-11 01:04:58 +00:00
Joe Runde
ac2f3f7fee
[Bugfix] Validate lora adapters to avoid crashing server ( #11727 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-01-10 15:56:36 +08:00
Wallas Henrique
cfd3219f58
[Hardware][Apple] Native support for macOS Apple Silicon ( #11696 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-01-08 16:35:49 +08:00
Rui Qiao
f8fcca100b
[Misc] Fix typo for valid_tool_parses ( #11753 )
...
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2025-01-06 07:12:38 +00:00
Nathan Azrak
68d37809b9
[Misc] Minimum requirements for SageMaker compatibility ( #11576 )
2025-01-02 15:59:25 -08:00
Joe Runde
4db72e57f6
[Bugfix][Refactor] Unify model management in frontend ( #11660 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2025-01-01 02:21:51 +00:00
Robert Shaw
5886aa496e
[V1] [6/N] API Server: Better Shutdown ( #11586 )
2024-12-30 15:51:02 +00:00
Robert Shaw
df04dffade
[V1] [4/N] API Server: ZMQ/MP Utilities ( #11541 )
2024-12-28 01:45:08 +00:00
Robert Shaw
55fb97f7bd
[2/N] API Server: Avoid ulimit footgun ( #11530 )
2024-12-26 23:43:05 +00:00
Robert Shaw
720b10fdc6
[1/N] API Server (Remove Proxy) ( #11529 )
2024-12-26 23:03:43 +00:00
Cyrus Leung
9edca6bf8f
[Frontend] Online Pooling API ( #11457 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-24 17:54:30 +08:00
Ricky Xu
584f0ae40d
[V1] Make AsyncLLMEngine v1-v0 opaque ( #11383 )
...
Signed-off-by: Ricky Xu <xuchen727@hotmail.com>
2024-12-21 15:14:08 +08:00
Joe Runde
2d1b9baa8f
[Bugfix] Fix request cancellation without polling ( #11190 )
2024-12-17 12:26:32 -08:00
Cyrus Leung
0920ab9131
[Doc] Reorganize online pooling APIs ( #11172 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-14 00:22:22 +08:00
Russell Bryant
ccede2b264
[Core] cleanup zmq ipc sockets on exit ( #11115 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-12-11 19:12:24 -08:00
Cyrus Leung
8f10d5e393
[Misc] Split up pooling tasks ( #10820 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-11 01:28:00 -08:00
Joe Runde
980ad394a8
[Frontend] Use request id from header ( #10968 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-12-10 13:46:29 +08:00
Russell Bryant
69d357ba12
[Core] Cleanup startup logging a bit ( #10961 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-12-07 02:30:23 +00:00
Ricky Xu
519e8e4182
[v1] EngineArgs for better config handling for v1 ( #10382 )
...
Signed-off-by: rickyx <rickyx@anyscale.com>
2024-11-25 21:09:43 -08:00
Chauncey
d04b13a380
[Bug]: Authorization ignored when root_path is set ( #10606 )
...
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2024-11-25 16:21:41 +00:00
Maximilien de Bayser
214efc2c3c
Support Cross encoder models ( #10400 )
...
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Co-authored-by: Flavia Beo <flavia.beo@ibm.com>
2024-11-24 18:56:20 -08:00
Cyrus Leung
32e46e000f
[Frontend] Automatic detection of chat content format from AST ( #9919 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-16 13:35:40 +08:00