xinyun/vllm - vllm - 丝路新云 (Silk Road New Cloud) code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-10 10:54:40 +08:00

Author	SHA1	Message	Date
hoshi-hiyouga	414919138b	[Bugfix] torchrun compatibility (#14899 ) Signed-off-by: hiyouga <hiyouga@buaa.edu.cn> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2025-03-18 05:49:27 -07:00
Jiayi Yao	2f42a4888c	[Feature] Support KV cache offloading and disagg prefill with LMCache connector. (#12953 )	2025-02-25 00:38:42 -08:00
youkaichao	3e472d882a	[core] set up data parallel communication (#13591 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-22 19:28:59 +08:00
youkaichao	a0231b7c25	[platform] add base class for communicators (#13208 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-16 22:14:22 +08:00
ZSL98	433c4a4923	Make vllm compatible with verl (#12824 ) Co-authored-by: zhangshulai <zhangshulai@bytedance.com>	2025-02-07 11:54:20 +08:00
Akash kaothalkar	022bcc701a	[Bugfix] Fix 'ModuleNotFoundError: No module named 'intel_extension_for_pytorch'' for --tensor-parallel-size more than 1 (#12546 )	2025-02-04 23:11:02 -08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628 ) - Add SPDX license headers to python source files - Check for SPDX headers using pre-commit commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:18:24 2025 -0500 Add SPDX license headers to python source files This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance. The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job. More information can be found on the SPDX site: - https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com> commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea Author: Russell Bryant <rbryant@redhat.com> Date: Fri Jan 31 14:36:32 2025 -0500 Check for SPDX headers using pre-commit Signed-off-by: Russell Bryant <rbryant@redhat.com> --------- Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Harry Mellor	823ab79633	Update `pre-commit` hooks (#12475 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-27 17:23:08 -07:00
Jani Monoses	9c485d9e25	[Core] Free CPU pinned memory on environment cleanup (#10477 )	2025-01-21 11:56:41 -08:00
youkaichao	ad34c0df0f	[core] platform agnostic executor via collective_rpc (#11256 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-15 13:45:21 +08:00
youkaichao	310aca88c9	[perf]fix current stream (#11870 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-09 07:18:21 +00:00
cennn	9e764e7b10	[distributed] remove pynccl's redundant change_state (#11749 )	2025-01-06 09:05:48 +08:00
cennn	635b897246	[distributed] remove pynccl's redundant stream (#11744 )	2025-01-05 23:09:11 +08:00
Yan Burman	300acb8347	[Core][Bugfix] Use correct device to initialize GPU data during CUDA-graph-capture (#11233 ) Signed-off-by: Yan Burman <yanburman@users.noreply.github.com> Signed-off-by: Ido Asraff <idoa@atero.ai>	2025-01-04 14:50:16 +08:00
youkaichao	b12e87f942	[platforms] enable platform plugins (#11602 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-30 20:24:45 +08:00
youkaichao	62de37a38e	[core][distributed] initialization from StatelessProcessGroup (#10986 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-12 09:04:19 +00:00
Kuntai Du	0590ec3fd9	[Core] Implement disagg prefill by StatelessProcessGroup (#10502 ) This PR provides initial support for single-node disaggregated prefill in 1P1D scenario. Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Co-authored-by: ApostaC <yihua98@uchicago.edu> Co-authored-by: YaoJiayi <120040070@link.cuhk.edu.cn>	2024-12-01 19:01:00 -06:00
Sage Moore	9a88f89799	custom allreduce + torch.compile (#10121 ) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-11-25 22:00:16 -08:00
Yan Ma	f10797c0ce	[Bugfix][XPU] Fix xpu tp by introducing XpuCommunicator (#10144 ) Signed-off-by: yan ma <yan.ma@intel.com>	2024-11-08 09:41:03 +00:00
Russell Bryant	098f94de42	[CI/Build] Drop Python 3.8 support (#10038 ) Signed-off-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-06 14:31:01 +00:00
Konrad Zawora	a02a50e6e5	[Hardware][Intel-Gaudi] Add Intel Gaudi (HPU) inference backend (#6143 ) Signed-off-by: yuwenzho <yuwen.zhou@intel.com> Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: Bob Zhu <bob.zhu@intel.com> Signed-off-by: zehao-intel <zehao.huang@intel.com> Signed-off-by: Konrad Zawora <kzawora@habana.ai> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com> Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai> Co-authored-by: Michal Adamczyk <madamczyk@habana.ai> Co-authored-by: Marceli Fylcek <mfylcek@habana.ai> Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com> Co-authored-by: Vivek Goel <vgoel@habana.ai> Co-authored-by: yuwenzho <yuwen.zhou@intel.com> Co-authored-by: Dominika Olszewska <dolszewska@habana.ai> Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com> Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com> Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai> Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com> Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai> Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com> Co-authored-by: Ilia Taraban <tarabanil@gmail.com> Co-authored-by: Chendi.Xue <chendi.xue@intel.com> Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai> Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai> Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com> Co-authored-by: Sun Choi <schoi@habana.ai> Co-authored-by: Iryna Boiko <iboiko@habana.ai> Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com> Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com> Co-authored-by: Zehao Huang <zehao.huang@intel.com> Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com> Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com> Co-authored-by: Nir David <ndavid@habana.ai> Co-authored-by: Yu-Zhou <yu.zhou@intel.com> Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai> Co-authored-by: Karol Damaszke <kdamaszke@habana.ai> Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by: Jacek Czaja <jacek.czaja@intel.com> Co-authored-by: Jacek Czaja <jczaja@habana.ai> Co-authored-by: Yuan <yuan.zhou@outlook.com>	2024-11-06 01:09:10 -08:00
youkaichao	96e0c9cbbd	[torch.compile] directly register custom op (#9896 ) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-10-31 21:56:09 -07:00
Yan Ma	04a3ae0aca	[Bugfix] Fix multi nodes TP+PP for XPU (#8884 ) Signed-off-by: YiSheng5 <syhm@mail.ustc.edu.cn> Signed-off-by: yan ma <yan.ma@intel.com> Co-authored-by: YiSheng5 <syhm@mail.ustc.edu.cn>	2024-10-29 21:34:45 -07:00
Yongzao	ad6f78053e	[torch.compile] expanding support and fix allgather compilation (#9637 ) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-10-24 01:32:15 -07:00
wangshuai09	3ddbe25502	[Hardware][CPU] using current_platform.is_cpu (#9536 )	2024-10-22 00:50:43 -07:00
Cody Yu	d11bf435a0	[MISC] Consolidate cleanup() and refactor offline_inference_with_prefix.py (#9510 )	2024-10-18 14:30:55 -07:00
youkaichao	663874e048	[torch.compile] improve allreduce registration (#9061 )	2024-10-04 16:43:50 -07:00
Kunshang Ji	d4bf085ad0	[MISC] add support custom_op check (#8557 ) Co-authored-by: youkaichao <youkaichao@126.com>	2024-09-20 19:03:55 -07:00
Cyrus Leung	6ffa3f314c	[CI/Build] Avoid CUDA initialization (#8534 )	2024-09-18 10:38:11 +00:00
youkaichao	99aa4eddaf	[torch.compile] register allreduce operations as custom ops (#8526 )	2024-09-16 22:57:57 -07:00
Cyrus Leung	7025b11d94	[Bugfix] Fix weight loading for Chameleon when TP>1 (#7410 )	2024-08-13 05:33:41 +00:00
Rui Qiao	997cf78308	[Misc] Fix typo in GroupCoordinator.recv() (#7167 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2024-08-05 11:10:16 -07:00
Aurick Qiao	0437492ea9	PP comm optimization: replace send with partial send + allgather (#6695 ) Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>	2024-07-31 20:15:42 -07:00
Cyrus Leung	f230cc2ca6	[Bugfix] Fix broadcasting logic for `multi_modal_kwargs` (#6836 )	2024-07-31 10:38:45 +08:00
Woosuk Kwon	d09b94ca58	[TPU] Support collective communications in XLA devices (#6813 )	2024-07-27 01:45:57 +00:00
Li, Jiang	3bbb4936dc	[Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125 )	2024-07-26 13:50:10 -07:00
Tyler Michael Smith	95db75de64	[Bugfix] Add synchronize to prevent possible data race (#6788 ) Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2024-07-25 10:40:01 -07:00
youkaichao	da78caecfa	[core][distributed] zmq fallback for broadcasting large objects (#6183 ) [core][distributed] add zmq fallback for broadcasting large objects (#6183)	2024-07-09 18:49:11 -07:00
youkaichao	3c6325f0fc	[core][distributed] custom allreduce when pp size > 1 (#6117 )	2024-07-03 14:41:32 -07:00
Murali Andoorveedu	c5832d2ae9	[Core] Pipeline Parallel Support (#4412 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>	2024-07-02 10:58:08 -07:00
Cyrus Leung	99397da534	[CI/Build] Add TP test for vision models (#5892 )	2024-06-29 15:45:54 +00:00
xwjiang2010	b90d8cd832	[Distributed] Make it clear that % should not be in tensor dict keys. (#5927 ) Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>	2024-06-28 15:20:22 +00:00
xwjiang2010	74d55c065b	[VLM][BugFix] Make sure that `multi_modal_kwargs` can broadcast properly with ring buffer. (#5905 ) Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-06-28 07:29:13 +00:00
xwjiang2010	d12af207d2	[VLM][Bugfix] Make sure that `multi_modal_kwargs` is broadcasted properly (#5880 ) Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>	2024-06-27 15:15:24 +08:00
Woo-Yeon Lee	2ce5d6688b	[Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414 )	2024-06-25 09:56:06 +00:00
Murali Andoorveedu	5d4d90536f	[Distributed] Add send and recv helpers (#5719 )	2024-06-23 14:42:28 -07:00
youkaichao	d9a252bc8e	[Core][Distributed] add shm broadcast (#5399 ) Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>	2024-06-21 05:12:35 +00:00
Kunshang Ji	728c4c8a06	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814 ) Co-authored-by: Jiang Li <jiang1.li@intel.com> Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com> Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>	2024-06-17 11:01:25 -07:00
youkaichao	d1c3d7d139	[misc][distributed] fix benign error in `is_in_the_same_node` (#5512 )	2024-06-14 10:59:28 -07:00
youkaichao	ea3890a5f0	[Core][Distributed] code deduplication in tp&pp with coordinator(#5293 ) [Core][Distributed] add coordinator to reduce code duplication in tp and pp (#5293)	2024-06-12 17:27:08 -07:00

62 Commits