xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-05 01:37:27 +08:00

Author	SHA1	Message	Date
wangxiyuan	c3ee80a01a	[V0 deprecation]clean up is_v1_supported_oracle (#28116 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-06 16:05:32 +08:00
Zhewen Li	0b8e871e5e	[CI/Build] Fix `test_defaults_with_usage_context` in AMD CI (#27926 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-11-05 15:40:24 -08:00
Snehlata	e15601789b	[Feature]: Add corrupted request metric to V1 metrics system. (#27306 ) Signed-off-by: atalhens <sneh.lata@nutanix.com>	2025-11-05 13:45:29 -08:00
Paul Zhang	faedbb4d4f	[Feature] Extend batch invariant torch.compile to B200 (#27856 ) Signed-off-by: PaulZhang12 <paulzhan@fb.com>	2025-11-05 10:04:49 -08:00
Samuel Shen	40db194446	[CI]: Add LMCacheConnector Unit Tests (#27852 ) Signed-off-by: Samuel Shen <slshen@uchciago.edu> Co-authored-by: Samuel Shen <slshen@uchciago.edu> Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>	2025-11-05 09:45:57 -08:00
Isotr0py	3f5a4b6473	[Bugfix] Validate custom logits processor xargs for online serving (#27560 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-11-05 16:53:33 +00:00
Kuntai Du	86dca07d9b	[Hybrid allocator + kv connector] revert connector test changes related to hybrid allocator (#28011 ) Signed-off-by: KuntaiDu <kuntai@uchicago.edu>	2025-11-05 10:36:31 +00:00
wangxiyuan	428bc7bf1c	[V0 deprecation] Remove VLLM_USE_V1 usage in most modules (#27955 ) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-11-04 20:51:16 -08:00
Nick Hill	938a81692e	[AsyncScheduling] Don't schedule past request max_tokens (#27922 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-04 17:06:28 +00:00
Nick Hill	c9f66da8fd	[PerfFix] Avoid separate thread for MP executor shm spin (#28012 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-04 08:33:55 -08:00
Mark McLoughlin	58279c60b5	[KV Connector] Make KVCacheConfig an explicit constructor argument (#27887 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-11-03 23:00:49 -08:00
Matthew Bonanni	01baefe674	Add TP parameter to attention tests (#27683 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-11-03 13:04:40 -08:00
Aurick Qiao	2c19d96777	[Spec Decode] Integrate Suffix Decoding from Arctic Inference (#25784 ) Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>	2025-11-03 09:23:31 -08:00
Lucas Wilkinson	4bc400f47e	[CI/Testing] Add basic single node dual batch overlap test (#27235 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-11-03 17:00:46 +00:00
Rémi Delacourt	cec7c28833	[Bugfix] Padded Eagle Specdec with Chunked Prefill (#26263 ) Signed-off-by: Rémi Delacourt <remi@mistral.ai> Signed-off-by: Rémi Delacourt <54138269+Flechman@users.noreply.github.com> Signed-off-by: remi <remi@mistral.ai> Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>	2025-11-03 02:22:46 -05:00
Biswa Panda	1bf43ae35d	[BugFix][LoRA] use adapter_id instead of id field of lora_request (#27728 ) Signed-off-by: Biswa Panda <biswa.panda@gmail.com>	2025-11-03 10:08:08 +08:00
Yihua Cheng	e675118849	[Add] cmdline argument parsing for KV cache offloading modules (#27621 ) Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-01 07:17:07 +00:00
Nick Hill	0cdbe7b744	[Core] Async scheduling + structured outputs compatibility (#26866 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-01 00:35:04 +00:00
Chen Zhang	df334868ca	[Hybrid] A simpler algorithm to find kernel_block_size (#26476 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-10-31 21:30:28 +00:00
Matthew Bonanni	f29aeb5a25	Add FLASHINFER_MLA to test_mla_backends and add B200 CI run (#27663 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-10-31 11:12:19 -07:00
GuanLuo	d6517be3cd	[Bugfix] Missing NIXL metadata for handshake initialization if instance spans multi-node (#26338 ) Signed-off-by: Guan Luo <gluo@nvidia.com> Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com> Signed-off-by: Guan Luo <41310872+GuanLuo@users.noreply.github.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>	2025-10-31 10:16:00 -07:00
Zhewen Li	0fe0140408	[KV offload] Enable CPU KV offload on CUDA alike Platforms (#27770 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-30 22:10:29 +08:00
Lucas Wilkinson	b5d70751d8	[BugFix] Reordering extend logic fix (#27739 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-10-29 21:39:34 -07:00
Nick Hill	2ce5c5d3d6	[BugFix] Handle unscheduled requests properly when async scheduling (#27756 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-10-29 21:04:25 -07:00
Nicolò Lucchesi	0f95a1c3f2	[CI] Fix flaky `test_two_responses_with_same_prev_id` test (#27745 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-10-29 15:10:35 +00:00
Zhewen Li	9a0d2f0d92	[CI/Build] Skip cpu offloading test on AMD (#27690 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-29 12:55:51 +00:00
Dipika Sikka	413ef7a3b4	[Speculators] Move tests + fix integration (#27308 ) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com> Signed-off-by: Rahul Tuli <rtuli@redhat.com> Signed-off-by: rahul-tuli <rtuli@redhat.com> Co-authored-by: Rahul Tuli <rtuli@redhat.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>	2025-10-29 00:54:21 -07:00
Nick Hill	4fe5895361	[AsyncScheduling] Make async overlap work with logprobs (#27615 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-10-28 22:35:54 +00:00
Or Ozeri	111faf1118	[Core] Scheduler: Publish connector events after output (#25875 ) Signed-off-by: Or Ozeri <oro@il.ibm.com>	2025-10-28 21:01:33 +00:00
Wentao Ye	6afc28a9ba	[Test] Batch Invariant: Unit test using parameterized backend (#27478 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-10-28 13:51:35 -07:00
Lucas Wilkinson	141e6a0505	[Misc] Make reorder batch also separate extends (#27367 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-10-28 10:55:10 -07:00
Mohammad Miadh Angkad	a8c02fb5bf	[Bugfix][CI] Fix v1 attention backend tests and add CI coverage (#26597 ) Signed-off-by: Mohammad Miadh Angkad <MAngkad.BSDSBA2027@aim.edu> Signed-off-by: Mohammad Miadh Angkad <mangkad.bsdsba2027@aim.edu> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>	2025-10-28 11:42:05 -04:00
Yeshwanth N	71b1c8b667	[Chore]:Extract math and argparse utilities to separate modules (#27188 ) Signed-off-by: Yeshwanth Surya <yeshsurya@gmail.com> Signed-off-by: Yeshwanth N <yeshsurya@gmail.com> Signed-off-by: yeshsurya <yeshsurya@gmail.com>	2025-10-26 04:03:32 -07:00
Kuntai Du	b853540388	[Core][Hybrid allocator + kv connector 1/n] Enable hybrid allocator + KV cache connector (#25712 ) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>	2025-10-24 23:34:18 -07:00
Jiangyun Zhu	29c9cb8007	[CI] Add tests for cudagraph (#27391 ) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>	2025-10-25 02:37:33 +00:00
kourosh hakhamaneshi	7e1d697b56	[Bugfix] Fix MultiConnector stats reconstruction across process boundaries (#27366 ) Signed-off-by: Kourosh Hakhamaneshi <Kourosh@anyscale.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>	2025-10-24 17:08:05 +00:00
Jonathan Chen	ca76486a16	[Chore] Separate out `vllm.utils.platform_utils.py` (#27374 ) Signed-off-by: Jonathan <chenleejonathan@gmail.com>	2025-10-23 19:08:06 +00:00
Tova Movshovitz	88afa11010	[Metrics] [KVConnector] Add connector prefix cache hit rate stats (#26245 ) Signed-off-by: tovam <tovam@pliops.com>	2025-10-23 12:21:08 +02:00
Zhewen Li	50b788a17a	[CI/Build] Fix AMD CI: test_cpu_gpu.py (#27388 ) Signed-off-by: zhewenli <zhewenli@meta.com>	2025-10-23 07:55:00 +00:00
Giancarlo Delfin	6644796bf4	[V1][spec decode] return logprobs for spec decoding (#26060 ) Signed-off-by: Giancarlo Delfin <gdelfin@meta.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-10-22 22:59:59 -07:00
Andrew Sansom	ff93cc8c84	[CORE] Support Prefix Caching with Prompt Embeds (#27219 ) Signed-off-by: Andrew Sansom <andrew@protopia.ai>	2025-10-22 22:18:07 -07:00
dongbo910220	a0003b56b0	[Chore] Separate out system utilities from vllm.utils (#27201 ) Signed-off-by: dongbo910220 <1275604947@qq.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-10-22 20:25:25 +00:00
Sage	1651003c35	[Prefix Cache] Use LoRA name for consistent KV-cache block hashing (#27211 ) Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>	2025-10-22 18:13:03 +00:00
Nicolò Lucchesi	4dfdb821c8	[P/D] Dynamic `kv_output_aggregator` collect size (#26734 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-10-22 18:07:58 +02:00
Russell Bryant	58fab50d82	[Frontend] Require flag for loading text and image embeds (#27204 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-22 15:52:02 +00:00
Mark McLoughlin	4ca13a8667	[NIXL] Terminate handshake listener thread in shutdown (#26404 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-10-22 16:59:53 +02:00
Nicolò Lucchesi	bfa59be8f1	[CI] Nixl integration tests DP-EP (#27199 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-10-22 11:17:48 +08:00
Tyler Michael Smith	6c2eef5a5d	[P/D] KVConnector for decode benchmarking (#25986 ) Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com> Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2025-10-21 16:30:47 -07:00
ExtReMLapin	4a8a567e16	Updated xgrammar backend to not deny supported string formats (#27253 ) Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr> Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com> Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-10-21 22:25:23 +00:00
Huy Do	becb7de40b	Update PyTorch to 2.9.0+cu129 (#24994 ) Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-10-21 17:20:18 -04:00

667 Commits