iAmir97
7a6c8c3fa1
[Chore] Separate out vllm.utils.network_utils (#27164)
Signed-off-by: iAmir97 <Amir.balwel@embeddedllm.com>
Co-authored-by: iAmir97 <Amir.balwel@embeddedllm.com>
2025-10-19 03:06:32 -07:00
Cyrus Leung
b3aba04e5a
[Benchmark] Convenience script for multiple parameter combinations (#27085)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-18 23:57:01 -07:00
wang.yuqi
f54f85129e
[Model][2/N] Improve all pooling task | Support multi-vector retrieval (#25370)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-10-15 11:14:41 +00:00
Max Wittig
fd85c9f426
[Bugfix][FE]: Always include usage with --enable-force-include-usage (#20983)
Signed-off-by: Max Wittig <max.wittig@siemens.com>
Signed-off-by: Antoine Auger <antoineauger@users.noreply.github.com>
Co-authored-by: Antoine Auger <antoineauger@users.noreply.github.com>
2025-10-14 09:17:39 +02:00
Lucia Fang
8317f72354
[Misc][DP] support customized aggregated logger for dp (#24354)
Signed-off-by: Lu Fang <fanglu@fb.com>
2025-10-13 17:45:59 -07:00
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Cyrus Leung
4bdf7ac593
[Bugfix] Fix SHM cache initialization (#26427)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-09 02:48:04 -07:00
Harry Mellor
4e256cadc2
Remove all references to yapf as it's no longer used (#26251)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 09:18:11 -07:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Isotr0py
a42d2df75f
[Frontend] Cache chat template kwargs resolution (#26227)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-04 15:32:30 +00:00
Russell Bryant
7977e5027c
Add filtering for chat template kwargs (#25794)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-09-27 10:46:49 +00:00
Russell Bryant
3f5d902d2a
Validate API tokens in constant time (#25781)
Signed-off-by: rentianyue-jk <rentianyue-jk@360shuke.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: rentianyue-jk <rentianyue-jk@360shuke.com>
2025-09-27 18:09:26 +08:00
wang.yuqi
7f570f1caa
[V0 deprecation] Remove unreachable model_config.supported_tasks (#25642)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-09-25 11:26:31 +00:00
Cyrus Leung
6c117cff7d
[Frontend] Pass API server count to each process (#23717)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-20 01:15:19 +08:00
Woosuk Kwon
e19bce40a1
[V0 Deprecation] Remove AsyncLLMEngine (#25025)
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-18 11:07:42 -07:00
dongbo910220
67244c86f0
feat(api): Return 503 on /health when engine is dead (#24897)
Signed-off-by: dongbo910220 <1275604947@qq.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-18 14:29:40 +00:00
Aaron Pham
29283e8976
[Chore] Cleanup guided namespace, move to structured outputs config (#22772)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-18 09:20:27 +00:00
Andrew Xia
bff2e5f1d6
[gpt-oss][2] fix types for streaming (#24556)
Signed-off-by: Andrew Xia <axia@meta.com>
2025-09-17 22:04:28 +00:00
Woosuk Kwon
5801e49776
[V0 Deprecation] Remove MQLLMEngine (#25019)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-16 21:29:27 -07:00
Andrew Xia
73df49ef3a
[gpt-oss][1a] create_responses stream outputs BaseModel type, api server is SSE still (#24759)
Signed-off-by: Andrew Xia <axia@meta.com>
2025-09-15 13:08:08 -07:00
Chen Zhang
1116590b16
[gpt-oss] Validate gpt-oss python tool during initialization (#23856)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-09-09 08:37:48 +00:00
wuhang
a38f8bd54c
[Feature][Responses API]Support MCP tools with streaming mode + background mode (#23927)
Signed-off-by: wuhang <wuhang6@huawei.com>
2025-09-04 04:05:10 +00:00
Christian Pinto
1cb39dbcdd
[Misc] IO Processor plugins for pooling models (#22820)
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
2025-08-31 23:07:12 -07:00
wang.yuqi
d9e00dbd1f
[Performance] V1 Classify Models E2E Performance Optimization (#23541)
Signed-off-by: wang.yuqi <noooop@126.com>
2025-08-29 03:12:32 -07:00
Didier Durand
d3da2eea54
[Doc]: fix typos in Python scripts (#23828)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-08-28 05:37:38 -07:00
Chen Zhang
3210264421
[Frontend] Add --log-error-stack to print stack trace for error response (#22960)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-27 04:58:59 +00:00
Matúš Námešný
384dd1b0a8
[Bugfix] Add missing enable_log_outputs parameter to init_app_state function (#23634)
Signed-off-by: Matúš Námešný <matus.namesny@ameria.com>
2025-08-26 12:13:15 +00:00
Chen Zhang
b95697d731
[Frontend] improve error logging of chat completion (#22957)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-20 13:03:37 -07:00
Russell Bryant
f77a0802b7
Limit HTTP header count and size (#23267)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
2025-08-20 17:57:37 +00:00
22quinn
f7cf5b512e
[Frontend] Add /collective_rpc API endpoint (#23075)
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-08-19 17:29:32 +00:00
Csrayz
a0632a3e03
[Frontend] Expose do_log_stats interval to env (#22905)
Signed-off-by: Csrayz <jover@cmbchina.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-15 13:00:20 +00:00
yyweiss
baece8c3d2
[Frontend] Add unix domain socket support (#18097)
Signed-off-by: <yyweiss@gmail.com>
Signed-off-by: yyw <yyweiss@gmail.com>
2025-08-08 16:23:44 -07:00
Chen Zhang
fe6d8257a1
[gpt-oss] Support tool call and implement MCP tool server (#22427)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-08 15:06:37 -07:00
Moritz Sanft
370661856b
[Frontend] Update OpenAI error response to upstream format (#22099)
Signed-off-by: Moritz Sanft <58110325+msanft@users.noreply.github.com>
2025-08-06 23:06:00 -07:00
Lionel Villard
ad6c655dde
preload heavy modules when mp method is forkserver (#22214)
Signed-off-by: Lionel Villard <villard@us.ibm.com>
2025-08-06 20:33:24 -07:00
Chen Zhang
19c9365aa4
[gpt-oss] add demo tool server (#22393)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-08-06 17:47:14 -07:00
Nick Hill
8d524ce79f
[BugFix] Improve internal DP load balancing (#21617)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-08-01 19:45:27 -07:00
Harry Mellor
2d7b09b998
Deprecate --disable-log-requests and replace with --enable-log-requests (#21739)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-01 17:16:37 +01:00
Nick Hill
3146519add
[BugFix] Don't change title of top-level process (#22032)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-08-01 07:37:55 -07:00
wuhang
e6680f9e25
[Bugfix] Add log prefix in non-dp mode engine core (#21889)
Signed-off-by: wuhang <wuhang6@huawei.com>
2025-08-01 09:04:16 +00:00
Yan Pashkovsky
bf668b5bf5
[Feature] Support multiple api keys in server (#18548)
Signed-off-by: Yan Pashkovsky <yanp.bugz@gmail.com>
2025-07-30 07:03:23 -07:00
Nick Hill
7234fe2685
[Misc] Rework process titles (#21780)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-07-29 05:14:47 +00:00
rongfu.leng
2cc571199b
[feature] add log non default args in LLM (#21680)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-07-28 02:21:22 -07:00
Cyrus Leung
86ae693f20
[Deprecation][2/N] Replace --task with --runner and --convert (#21470)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-27 19:42:40 -07:00
Cyrus Leung
46d81d6951
[V1] Get supported tasks from model runner instead of model config (#21585)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-25 05:36:45 -07:00
Cyrus Leung
34ddcf9ff4
[Frontend] run-batch supports V1 (#21541)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-24 20:05:55 -07:00
Chauncey
6da0078523
[Feat] Allow custom naming of vLLM processes (#21445)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-07-24 03:15:23 -07:00
Shintarou Okada
6eca337ce0
Replace --expand-tools-even-if-tool-choice-none with --exclude-tools-when-tool-choice-none for v0.10.0 (#20544)
Signed-off-by: okada <kokuzen@gmail.com>
Signed-off-by: okada shintarou <okada@preferred.jp>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-24 02:56:36 -07:00
Michael Goin
82ec66f514
[V0 Deprecation] Remove Prompt Adapters (#20588)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-23 16:36:48 -07:00
Cyrus Leung
042af0c8d3
[Model][1/N] Support multiple poolers at model level (#21227)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-21 02:22:21 -07:00