xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-05 06:09:09 +08:00

Author	SHA1	Message	Date
Huy Do	e7ef74e26e	Fix some issues with benchmark data output (#13641) Signed-off-by: Huy Do <huydhn@gmail.com>	2025-02-24 10:23:18 +08:00
youkaichao	eb24dc4a45	[v1] torchrun compatibility (#13642) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-23 22:47:24 +08:00
Kevin H. Luu	2c5e637b57	[ci] Use env var to control whether to use S3 bucket in CI (#13634)	2025-02-22 19:19:45 -08:00
youkaichao	3e472d882a	[core] set up data parallel communication (#13591) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-22 19:28:59 +08:00
Harry Mellor	992e5c3d34	Merge similar examples in `offline_inference` into single `basic` example (#12737)	2025-02-20 04:53:51 -08:00
Kevin H. Luu	88f6ba3281	[ci] Add AWS creds for AMD (#13572)	2025-02-20 03:56:06 +00:00
Yannick Schnider	423330263b	[Feature] Pluggable platform-specific scheduler (#13161) Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com> Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>	2025-02-19 17:16:38 +08:00
Lucia Fang	f525c0be8b	[Model][Speculative Decoding] DeepSeek MTP spec decode (#12755) Signed-off-by: Lu Fang <fanglu@fb.com> Co-authored-by: LiuXiaoxuanPKU <lilyliupku@gmail.com>	2025-02-19 17:06:23 +08:00
Kevin H. Luu	3b05cd4555	[perf-benchmark] Fix ECR path for premerge benchmark (#13512) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-19 07:56:11 +00:00
Kevin H. Luu	9aa95b0e6a	[perf-benchmark] Allow premerge ECR (#13509) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-19 05:13:41 +00:00
Harry Mellor	00b69c2d27	[Misc] Remove dangling references to `--use-v2-block-manager` (#13492) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-02-19 03:37:26 +00:00
Huy Do	45186834a0	Run v1 benchmark and integrate with PyTorch OSS benchmark database (#13068) Signed-off-by: Huy Do <huydhn@gmail.com>	2025-02-17 08:16:32 +00:00
Nicolò Lucchesi	d84cef76eb	[Frontend] Add `/v1/audio/transcriptions` OpenAI API endpoint (#12909)	2025-02-13 07:23:45 -08:00
Kevin H. Luu	9f9704dca6	[perf-benchmark] cleanup unused Docker images and volumes in H100 benchmark instance (#12706)	2025-02-12 19:51:33 -08:00
Kevin H. Luu	842b0fd402	[ci] Add more source file dependencies for some tests (#13123) Signed-off-by: <> Co-authored-by: EC2 Default User <ec2-user@ip-172-31-20-117.us-west-2.compute.internal>	2025-02-11 20:38:10 -08:00
youkaichao	aa0ca5ebb7	[core][rlhf] add colocate example for RLHF (#12984) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-10 10:28:59 +08:00
Cyrus Leung	8a69e0e20e	[CI/Build] Auto-fix Markdown files (#12941)	2025-02-08 04:25:15 -08:00
Liangfu Chen	c45d398e6f	[CI] Resolve transformers-neuronx version conflict (#12925)	2025-02-08 01:41:35 -08:00
Robert Shaw	932c6b7461	[V1] LM Eval With Streaming Integration Tests (#11590)	2025-02-07 15:07:03 -08:00
Rahul Tuli	3b2005e1db	Add: Support for Sparse24Bitmask Compressed Models	2025-02-05 13:30:43 -08:00
youkaichao	bc1bdecebf	[core][distributed] exact ray placement control (#12732) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-06 02:03:19 +08:00
Arthur	a1a2aaadb9	[Model]: Add `transformers` backend support (#11330). Adds support for `transformers` as a backend. Following https://github.com/huggingface/transformers/pull/35235, a bunch of models should already be supported, we are ramping up support for more models. Thanks @Isotr0py for the TP support, and @hmellor for his help as well! This includes: (1) `trust_remote_code=True` support: any model on the hub, if it implements attention the correct way, can be natively supported; (2) tensor parallel support. Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Isotr0py <41363108+Isotr0py@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-02-03 21:30:38 +08:00
youkaichao	1298a400e8	[ci/build] fix gh200 test (#12681) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-03 15:59:49 +08:00
youkaichao	20579c0fae	make sure mistral_common not imported for non-mistral models (#12669). When people use deepseek models, they find that they need to solve cv2 version conflict, see https://zhuanlan.zhihu.com/p/21064432691 . I added the check, and make all imports of `cv2` lazy. Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-02-03 13:40:25 +08:00
Russell Bryant	e489ad7a21	[Misc] Add SPDX-License-Identifier headers to python source files (#12628). Changes: add SPDX license headers to python source files; check for SPDX headers using pre-commit. Commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745 (Author: Russell Bryant <rbryant@redhat.com>, Date: Fri Jan 31 14:18:24 2025 -0500): Add SPDX license headers to python source files. This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance. The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job. More information can be found on the SPDX site: https://spdx.dev/learn/handling-license-info/ Signed-off-by: Russell Bryant <rbryant@redhat.com>. Commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea (Author: Russell Bryant <rbryant@redhat.com>, Date: Fri Jan 31 14:36:32 2025 -0500): Check for SPDX headers using pre-commit. Signed-off-by: Russell Bryant <rbryant@redhat.com>. Signed-off-by: Russell Bryant <rbryant@redhat.com>	2025-02-02 11:58:18 -08:00
Kevin H. Luu	415f19474d	[release] Add input step to ask for Release version (#12631) Instead of having to create a new build with release version put in as env var.	2025-01-31 13:39:36 -08:00
fenghuizhang	80fcc3ed1c	[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels (#12482) Signed-off-by: Fenghui Zhang <fhzhang@google.com>	2025-01-28 22:36:44 +00:00
Liangfu Chen	ddee88d0ff	[Neuron][Kernel] NKI-based flash-attention kernel with paged KV cache (#11277) Signed-off-by: Liangfu Chen <liangfc@amazon.com> Co-authored-by: Jiangfei Duan <jfduan@outlook.com>	2025-01-27 17:31:16 -08:00
Bowen Wang	2bc3fbba0c	[FlashInfer] Upgrade to 0.2.0 (#11194) Signed-off-by: Bowen Wang <abmfy@icloud.com> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2025-01-27 18:19:24 +00:00
youkaichao	e784c6b998	[ci/build] sync default value for wheel size (#12398) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-24 17:54:29 +08:00
youkaichao	c7c9851036	[ci/build] fix wheel size check (#12396) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-24 17:31:25 +08:00
youkaichao	68ad4e3a8d	[Core] Support fully transparent sleep mode (#11743) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-22 14:39:32 +08:00
Liangfu Chen	016e3676e7	[CI] add docker volume prune to neuron CI (#12291) Signed-off-by: Liangfu Chen <liangfc@amazon.com>	2025-01-22 10:47:49 +08:00
youkaichao	2fc6944c5e	[ci/build] disable failed and flaky tests (#12240) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-21 13:25:03 +08:00
Harry Mellor	3ea7b94523	Move linting to `pre-commit` (#11975) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-20 14:58:01 +08:00
Isotr0py	02798ecabe	[Model] Port deepseek-vl2 processor, remove dependency (#12169) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-01-18 13:59:39 +08:00
youkaichao	87a0c076af	[core] allow callable in collective_rpc (#12151) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-17 20:47:01 +08:00
Li, Jiang	d4e6194570	[CI/Build][CPU][Bugfix] Fix CPU CI (#12150) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-01-17 19:39:52 +08:00
Kunshang Ji	fead53ba78	[CI]add genai-perf benchmark in nightly benchmark (#10704) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-01-17 04:15:09 +00:00
youkaichao	92e793d91a	[core] LLM.collective_rpc interface and RLHF example (#12084) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-16 20:19:52 +08:00
youkaichao	bf53e0c70b	Support torchrun and SPMD-style offline inference (#12071) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-16 19:58:53 +08:00
youkaichao	ff39141a49	[HPU][misc] add comments for explanation (#12034) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-14 19:24:06 +08:00
Konrad Zawora	078da31903	[HPU][Bugfix] set_forward_context and CI test execution (#12014) Signed-off-by: Konrad Zawora <kzawora@habana.ai>	2025-01-14 11:04:18 +08:00
Sungjae Lee	80ea3af1a0	[CI][Spec Decode] fix: broken test for EAGLE model (#11972) Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>	2025-01-13 06:50:35 +00:00
Akshat Tripathi	8bddb73512	[Hardware][CPU] Multi-LoRA implementation for the CPU backend (#11100) Signed-off-by: Akshat Tripathi <akshat@krai.ai> Signed-off-by: Oleg Mosalov <oleg@krai.ai> Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Oleg Mosalov <oleg@krai.ai> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-01-12 13:01:52 +00:00
Isotr0py	f967e51f38	[Model] Initialize support for Deepseek-VL2 models (#11578) Signed-off-by: Isotr0py <2037008807@qq.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-01-12 00:17:24 -08:00
Cyrus Leung	7a3a83e3b8	[CI/Build] Move model-specific multi-modal processing tests (#11934) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-11 13:50:05 +08:00
Harry Mellor	482cdc494e	[Doc] Rename offline inference examples (#11927) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-10 23:50:29 +08:00
youkaichao	241ad7b301	[ci] Fix sampler tests (#11922) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-10 20:45:33 +08:00
Harry Mellor	d85c47d6ad	Replace "online inference" with "online serving" (#11923) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-10 12:05:56 +00:00

408 Commits