Cyrus Leung
ad430a67ca
[Metrics] Log multi-modal cache stats and fix reset ( #26285 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-10 01:45:55 -07:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort ( #26247 )
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Aaron Pham
6a113d9aed
[V0 Deprecation] Remove vllm.worker and update according imports ( #25901 )
2025-09-29 23:26:11 +00:00
Nick Hill
8b77328ffe
[Misc] Don't log shm dequeue delay warning on worker side ( #25720 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-09-26 01:08:30 +00:00
Woosuk Kwon
7ed82d1974
[V0 Deprecation] Remove V0 MP executor ( #25329 )
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-20 21:26:35 -07:00
Chao Lei
8de261b04a
[P/D]kv_output_aggregator support P TP > D TP ( #23917 )
Signed-off-by: LCAIZJ <leichao139636@163.com>
Co-authored-by: leichao.lc <leichao.lc@antgroup.com>
2025-09-15 11:36:06 +02:00
Nick Hill
4fdd6f5cbf
[Core] Support async scheduling with uniproc executor ( #24219 )
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Ronald1995 <ronaldautomobile@163.com>
Co-authored-by: Ronald1995 <ronaldautomobile@163.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2025-09-12 16:34:28 -07:00
Cyrus Leung
010acc6e1e
[Bugfix] Fix incompatibility between #20452 and #24548 ( #24754 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-09-12 11:17:29 -07:00
dongluw
a5b84f1cbf
[Core] Shared memory based object store for Multimodal data caching and IPC ( #20452 )
Signed-off-by: donglu <donglu@cohere.com>
2025-09-12 07:54:17 -07:00
22quinn
0cdd213641
[Misc] Improve Worker process title and logging prefix ( #22205 )
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-09-08 21:43:48 -07:00
Chauncey
61aa4b2901
[P/D] Add a shutdown method to the Connector API ( #22699 )
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-09-07 23:07:00 -07:00
Benjamin Chislett
cee182b297
[Perf][V1] Fully overlap model execution ( #23569 )
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
2025-09-05 18:20:17 -07:00
Shiyan Deng
9dfbeb41e5
[RFC] allow cancelation after shutdown in blocking collective_rpc ( #23390 )
Signed-off-by: Shiyan Deng <dsy842974287@meta.com>
2025-09-05 14:14:18 -07:00
Nick Hill
d90d8eb674
[BugFix] Async scheduling and PP compatibility with DP ( #23770 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-08-29 08:17:27 -07:00
Hyogeun Oh (오효근)
4e4d017b6f
[Docs] Fix warnings in mkdocs build (continued) ( #23743 )
Signed-off-by: Zerohertz <ohg3417@gmail.com>
Signed-off-by: Hyogeun Oh (오효근) <ohg3417@gmail.com>
2025-08-27 17:17:29 +00:00
22quinn
480bdf5a7b
[Core] Support custom executor qualname ( #23314 )
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
2025-08-22 09:40:54 +08:00
Woosuk Kwon
c9b38be8aa
[Spec Decode] Make propose_draft_token_ids non-blocking for lower TTFT ( #23041 )
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-08-18 17:20:38 -07:00
H
24d1dffbeb
[executor] feat: add supports_pp attr to executors ( #21786 )
Signed-off-by: Haibin Lin <haibin.lin@bytedance.com>
2025-08-03 18:04:45 +08:00
wuhang
e6680f9e25
[Bugfix] Add log prefix in non-dp mode engine core ( #21889 )
Signed-off-by: wuhang <wuhang6@huawei.com>
2025-08-01 09:04:16 +00:00
Nick Hill
7234fe2685
[Misc] Rework process titles ( #21780 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-07-29 05:14:47 +00:00
wuhang
bccc43c033
[Bugfix]check health for engine core process exiting unexpectedly ( #21728 )
Signed-off-by: wuhang <wuhang6@huawei.com>
2025-07-28 06:17:31 -07:00
Chauncey
6da0078523
[Feat] Allow custom naming of vLLM processes ( #21445 )
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-07-24 03:15:23 -07:00
kourosh hakhamaneshi
9f414a12ad
[BugFix] Make PD work with Ray ( #21072 )
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
2025-07-19 08:46:50 -07:00
Rui Qiao
217937221b
Elastic Expert Parallel Initial Support ( #20775 )
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2025-07-18 17:46:09 -07:00
David Ben-David
4fcef49ec4
[V1] [KVConnector] Fix MultiprocExecutor worker output aggregation ( #21048 )
Signed-off-by: David Ben-David <davidb@pliops.com>
Co-authored-by: David Ben-David <davidb@pliops.com>
2025-07-17 13:29:45 +08:00
Chen LI
10be209493
[Bug Fix] get_distributed_init_method should get the ip from get_ip i… ( #20889 )
Signed-off-by: Chen Li <lcpingping@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-07-15 21:23:52 +00:00
Woosuk Kwon
d4d309409f
Implement Async Scheduling ( #19970 )
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-14 23:01:46 -07:00
Nick Hill
574ad60db9
[KVConnector] Always call connector clear_metadata() at end of step ( #20756 )
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: David Ben-David <sdavidbd@gmail.com>
2025-07-10 22:37:27 +01:00
Or Ozeri
cc876d0f29
[KVConnector] Aggregate finished requests on the scheduler ( #19555 )
Signed-off-by: Or Ozeri <oro@il.ibm.com>
2025-07-10 09:22:18 +01:00
jinqinn
f39ab2d4bd
[Misc] Configurable timeout for execute_model RPC calls via env var ( #19544 )
Signed-off-by: jinqinn <goodqinjin@163.com>
2025-06-22 20:36:26 -07:00
jennyyyyzhen
cda10fa3e2
[Multi Modal] Add an env var for message queue max chunk bytes ( #19242 )
Signed-off-by: yZhen <yZhen@fb.com>
Co-authored-by: yZhen <yZhen@fb.com>
2025-06-08 21:39:12 +08:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText ( #19100 )
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Nick Hill
93ecb8139c
[BugFix] Increase TP execute_model timeout ( #18558 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-05-23 10:22:11 +08:00
Rabi Mishra
acb54ca8e1
Intialize io_thread_pool attribute in the beginning. ( #18331 )
Signed-off-by: rabi <ramishra@redhat.com>
2025-05-21 20:21:14 -07:00
Nick Hill
ed5272cf21
[BugFix] Avoid secondary missing MultiprocExecutor.workers error ( #17811 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-05-07 21:55:04 +00:00
Li, Jiang
a6fed02068
[V1][PP] Support PP for MultiprocExecutor ( #14219 )
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Signed-off-by: jiang.li <jiang1.li@intel.com>
2025-05-06 07:58:05 -07:00
Nick Hill
5175b884f7
[BugFix] Remove default multiproc executor collective_rpc timeout ( #17000 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-22 23:27:14 +00:00
Robert Shaw
2b05b8ce69
[V1][Frontend] Improve Shutdown And Logs ( #11737 )
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Andrew Feldman <afeldman@neuralmagic.com>
Co-authored-by: afeldman-nm <156691304+afeldman-nm@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-04-16 19:48:34 -07:00
yihong
04149cce27
[BugFix] fix some typos found by typos. ( #16314 )
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-09 03:43:59 -07:00
youkaichao
a865bc1ca6
[core] do not send error across process ( #16174 )
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-04-07 19:09:03 -07:00
Nick Hill
15dac210f0
[V1] AsyncLLM data parallel ( #13923 )
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-03-27 16:14:41 -07:00
Chen Zhang
93a00d7dde
[v1] Refactor KVCacheConfig ( #14079 )
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-03-21 04:56:27 -07:00
Mickaël Seznec
a597a57595
[Attention] Flash Attention 3 - fp8 ( #14570 )
Signed-off-by: Mickael Seznec <mickael@mistral.ai>
2025-03-20 01:14:20 -04:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
Li, Jiang
02296f420d
[Bugfix][V1][Minor] Fix shutting_down flag checking in V1 MultiprocExecutor ( #14053 )
2025-02-28 22:31:01 -08:00
youkaichao
eb24dc4a45
[v1] torchrun compatibility ( #13642 )
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-23 22:47:24 +08:00
youkaichao
3e472d882a
[core] set up data parallel communication ( #13591 )
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-22 19:28:59 +08:00
Woosuk Kwon
4fb8142a0e
[V1][PP] Enable true PP with Ray executor ( #13472 )
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-02-18 09:15:32 -08:00
Cody Yu
9206b3d7ec
[V1][PP] Run engine busy loop with batch queue ( #13064 )
2025-02-15 03:59:01 -08:00
Rui Qiao
9605c1256e
[V1][core] Implement pipeline parallel on Ray ( #12996 )
2025-02-13 08:02:46 +00:00