xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-27 14:57:21 +08:00

Author	SHA1	Message	Date
Cyrus Leung	43c4f3d77c	[Misc] Begin deprecation of `get_tensor_model_*_group` (#22494 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-08 01:11:54 -07:00
Shu Wang	b2c8ce57c6	Fix Flashinfer CUTLASS MOE Allgather (#21963 ) Signed-off-by: Shu Wang <shuw@nvidia.com>	2025-08-07 19:18:25 -07:00
WeiQing Chen	4be02a3776	[Bugfix] EPLB load statistics problem (#22167 ) Signed-off-by: ycyaw66 <497410282@qq.com> Signed-off-by: David Chen <530634352@qq.com> Co-authored-by: ycyaw66 <497410282@qq.com>	2025-08-07 04:07:54 +00:00
Ning Xie	74333ae2f6	[Misc] correct static type check for GroupCoordinator (#21946 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-05 03:17:46 -07:00
Ning Xie	bd3db7f469	[Misc] log more detailed message for ensure_model_parallel_initialized (#22144 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-04 19:36:55 -07:00
Ning Xie	29b97c0995	[Doc] add backend to doc string of initialize_model_parallel (#22142 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-04 19:36:20 -07:00
lkchen	f4f4e7ef27	[V0 deprecation][P/D] Deprecate v0 `KVConnectorBase` code (1/2) (#21785 ) Signed-off-by: Linkun Chen <github@lkchen.net>	2025-08-04 19:11:33 -07:00
Ning Xie	c2e75b3c11	remove duplicate code within cleanup_dist_env_and_memory (#22147 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-03 20:03:58 -07:00
David Ben-David	aefeea0fde	[V1] [P/D] Refactor KV Connector Path (#21980 ) Signed-off-by: David Ben-David <davidb@pliops.com> Co-authored-by: David Ben-David <davidb@pliops.com>	2025-08-03 04:03:40 -07:00
Ning Xie	7de45db9a5	[Misc] update doc comment for send (#22026 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-08-03 00:55:20 -07:00
Rui Qiao	d331759488	Introduce RayPPCommunicator for ray-based PP (#21660 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-08-01 11:50:58 -07:00
wxsm	f4135232b9	feat(distributed): add `get_required_kvcache_layout` class method to kv connector api (#20433 ) Signed-off-by: wxsm <wxsms@foxmail.com>	2025-07-30 16:41:51 +00:00
Chenguang Zheng	4904e53c32	[Bugfix] SharedStorage Connector for V1 PD multimodal (#21611 ) Signed-off-by: fake0fan <645327136@qq.com> Signed-off-by: herotai214 <herotai214@gmail.com> Co-authored-by: herotai214 <herotai214@gmail.com>	2025-07-30 09:18:37 -07:00
Calvin Chen	e18f085103	skip fusedmoe layer for start_load_kv (#21378 ) Signed-off-by: calvin chen <wen.chen@dynamia.ai>	2025-07-28 18:59:44 -07:00
Kuntai Du	b18b417fbf	Revert "[V1] Exception Handling when Loading KV Cache from Remote Store" (#21778 ) Signed-off-by: KuntaiDu <kuntai@uchicago.edu>	2025-07-28 20:15:18 +00:00
Nick Hill	7d44c691b0	[P/D] Log warnings related to prefill KV expiry (#21753 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-28 18:40:53 +00:00
Adeline	15a72ac478	[V1] Exception Handling when Loading KV Cache from Remote Store (#21534 ) Signed-off-by: liuyumoye <adeline_ly2023@outlook.com> Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>	2025-07-27 20:34:17 -07:00
WeiQing Chen	97d6c30cc9	[BugFix] Fix shared storage connector load kv only load attention layer (#21428 ) Signed-off-by: David Chen <530634352@qq.com>	2025-07-26 07:07:40 -07:00
Juncheng Gu	6066284914	[P/D] Support CPU Transfer in NixlConnector (#18293 ) Signed-off-by: Juncheng Gu <juncgu@gmail.com> Signed-off-by: Richard Liu <ricliu@google.com> Co-authored-by: Richard Liu <39319471+richardsliu@users.noreply.github.com> Co-authored-by: Richard Liu <ricliu@google.com>	2025-07-24 17:58:42 +01:00
Rui Qiao	1e9ea8e69d	[P/D] Move FakeNixlWrapper to test dir (#21328 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-07-24 08:53:45 -07:00
Li, Jiang	a15a50fc17	[CPU] Enable shared-memory based pipeline parallel for CPU backend (#21289 ) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-07-21 09:07:08 -07:00
kourosh hakhamaneshi	9f414a12ad	[BugFix] Make PD work with Ray (#21072 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2025-07-19 08:46:50 -07:00
Rui Qiao	217937221b	Elastic Expert Parallel Initial Support (#20775 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-07-18 17:46:09 -07:00
Woosuk Kwon	4de7146351	[V0 deprecation] Remove V0 HPU backend (#21131 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-17 16:37:36 -07:00
Zhonghua Deng	8a4e5c5f3c	[V1][P/D]Enhance Performance and code readability for P2pNcclConnector (#20906 ) Signed-off-by: Abatom <abzhonghua@gmail.com>	2025-07-16 22:13:00 -07:00
Trevor Morris	a8593237c0	Add pynccl all-gatherv and reducescatterv (#20154 ) Signed-off-by: Trevor Morris <tmorris@nvidia.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-07-11 18:59:23 -07:00
Varun Sundar Rabindranath	53fa457391	[Misc] Add unit tests for MoE ModularKernel combinations + Profiling utility (#20449 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-07-11 07:51:46 -07:00
Nick Hill	574ad60db9	[KVConnector] Always call connector `clear_metadata()` at end of step (#20756 ) Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: David Ben-David <sdavidbd@gmail.com>	2025-07-10 22:37:27 +01:00
Or Ozeri	cc876d0f29	[KVConnector] Aggregate finished requests on the scheduler (#19555 ) Signed-off-by: Or Ozeri <oro@il.ibm.com>	2025-07-10 09:22:18 +01:00
Yiming	cd587c93ef	[BugFix]: Properly set engine_id when using multi connector (#19487 ) Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: leiyiming <leiyiming@kingsoft.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-07-09 20:32:44 +00:00
Liangliang Ma	a3e4e85ece	[XPU][CI] enhance xpu test support (#20652 ) Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai>	2025-07-09 16:53:09 +00:00
Nicolò Lucchesi	71d1d75b7a	[PD][Nixl] Remote consumer READ timeout for clearing request blocks (#20139 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-07-08 08:56:40 +01:00
Jee Jee Li	1caca5a589	[Misc] Add SPDX-FileCopyrightText (#20428 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-04 07:40:42 +00:00
Nicolò Lucchesi	8d775dd30a	[Misc] Fix `Unable to detect current VLLM config. Defaulting to NHD kv cache layout` warning (#20400 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-07-03 14:56:09 -07:00
Ning Xie	1dba2c4ebe	[Misc] adjust for ipv6 for mookcacke url parse (#20107 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-07-03 20:27:17 +00:00
Woosuk Kwon	7f280d69c9	[Optimization] Cache sampled token ids in model runner (#20291 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-07-01 11:01:31 -07:00
Nicolò Lucchesi	650d5dbd04	[Misc] Minor refactor of NIXL background handshake (#20068 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-07-01 12:40:14 +01:00
Michael Goin	be250bbc67	[V1] Only print cudagraph tqdm on rank 0 with `is_global_first_rank` (#19516 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-01 06:02:09 +00:00
Zhonghua Deng	ded1fb635b	[Bugfix][V1][P/D]Fix the issue of occasional garbled output for P2pNcclConnector (#20263 ) Signed-off-by: Abatom <abzhonghua@gmail.com>	2025-06-30 16:45:14 -07:00
Woosuk Kwon	2863befce3	[Optimization] Use Shared `CachedRequestData` Instance Across All Requests (#20232 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-06-30 09:07:50 -07:00
Wentao Ye	4d36693687	[Refactor] Create a function util and cache the results for `has_deepgemm`, `has_deepep`, `has_pplx` (#20187 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-06-28 22:06:38 +00:00
li haoyang	0740e29b66	[Feature] add quick all reduce (#19744 ) Signed-off-by: ilmarkov <imarkov@redhat.com> Signed-off-by: Haoyang Li <Haoyang.Li@amd.com> Co-authored-by: ilmarkov <imarkov@redhat.com>	2025-06-26 20:54:24 -07:00
Bowen Wang	e9fd658a73	[Feature] Expert Parallelism Load Balancer (EPLB) (#18343 ) Signed-off-by: Bowen Wang <abmfy@icloud.com>	2025-06-26 15:30:21 -07:00
Nicolò Lucchesi	2582683566	[PD] Skip `tp_size` exchange with rank0 (#19413 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-06-25 20:04:39 -07:00
Nick Hill	55c65ab495	[P/D] Avoid stranding blocks in P when aborted in D's waiting queue (#19223 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-06-25 15:19:44 -07:00
Nick Hill	c40692bf9a	[Misc] Add parallel state `node_count` function (#20045 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-06-25 13:38:53 -07:00
lkchen	91f7d9d0b6	[P/D] Asynchronously do _nixl_handshake (#19836 ) Signed-off-by: Linkun Chen <github@lkchen.net> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-06-24 12:46:10 -07:00
lkchen	d0132f025d	[Misc] Add type alias `ReqId` and `EngineId` for better readability (#19880 ) Signed-off-by: Linkun Chen <github@lkchen.net>	2025-06-23 12:57:57 -07:00
lkchen	1bcd15edc7	[BugFix][P/D] Fix for cases where _recving_transfers can be cleaned up when all transfer done (#19874 ) Signed-off-by: Linkun Chen <github@lkchen.net>	2025-06-22 22:41:53 -07:00
Nicolò Lucchesi	2ebff5b77c	[P/D][NixlConnector] Support `tp_size > num_kv_heads` deployments (#19691 ) Signed-off-by: NickLucche <nlucches@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-06-22 22:41:50 -07:00

282 Commits