xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-20 01:37:11 +08:00

Author	SHA1	Message	Date
Isotr0py	6a39ba85fe	[Bugfix] Fix failing multimodal standard test (#22153 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-08-03 19:04:38 +00:00
Woosuk Kwon	6d98843b31	[Responses API] Disable response store by default (#22137 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-03 04:04:21 -07:00
David Ben-David	aefeea0fde	[V1] [P/D] Refactor KV Connector Path (#21980 ) Signed-off-by: David Ben-David <davidb@pliops.com> Co-authored-by: David Ben-David <davidb@pliops.com>	2025-08-03 04:03:40 -07:00
Isotr0py	3dddbf1f25	[Misc] Add tensor schema test coverage for multimodal models (#21754 ) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Isotr0py <2037008807@qq.com>	2025-08-03 00:52:14 -07:00
Cyrus Leung	f5d0f4784f	[Frontend] Improve error message for too many mm items (#22114 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-08-02 02:20:38 -07:00
Chih-Chieh Yang	b690e34824	[Model] Mamba2 preallocate SSM output tensor to avoid d2d copy overhead (#21075 ) Signed-off-by: Chih-Chieh Yang <7364402+cyang49@users.noreply.github.com> Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>	2025-08-02 01:59:34 -07:00
Yuxuan Zhang	25373b6c6c	for glm-4.1V update (#22000 ) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com> Co-authored-by: Isotr0py <2037008807@qq.com>	2025-08-02 01:46:57 -07:00
Roger Wang	067c34a155	docs: remove deprecated disable-log-requests flag (#22113 ) Signed-off-by: Roger Wang <hey@rogerw.me>	2025-08-02 00:19:48 -07:00
Yong Hoon Shin	8564dc9448	Fix test_kv_sharing_fast_prefill flakiness (#22038 ) Signed-off-by: Yong Hoon Shin <yhshin@meta.com>	2025-08-01 23:55:34 -07:00
Rui Qiao	4ac8437352	[Misc] Getting and passing ray runtime_env to workers (#22040 ) Signed-off-by: Rui Qiao <ruisearch42@gmail.com>	2025-08-01 23:54:40 -07:00
Sage Moore	0edaf752d7	[Attention][DBO] Add support for "splitting" the CommonAttentionMetadata (#21153 ) Signed-off-by: Sage Moore <sage@neuralmagic.com>	2025-08-01 19:47:53 -07:00
Wentao Ye	6e8d8c4afb	[Test] Add Unit Test for Batched DeepGEMM (#21559 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-08-02 10:45:46 +08:00
Dipika Sikka	9f9c38c392	[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (#21835 ) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>	2025-08-01 19:43:37 -07:00
Michael Goin	88faa466d7	[CI] Initial tests for SM100 Blackwell runner (#21877 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-08-01 16:18:38 -07:00
XiongfeiWei	d84b97a3e3	Add lora test for tp>1 case for TPU. (#21970 ) Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>	2025-08-01 18:56:08 +00:00
Harry Mellor	38c8bce8b6	Enable headless models for pooling in the Transformers backend (#21767 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-01 10:31:29 -07:00
rongfu.leng	b879ecd6e2	[Bugfix] fix when skip tokenizer init (#21922 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-08-01 10:09:36 -07:00
Isotr0py	3f8e952179	[Bugfix] Fix glm4.1v video inference issue (#22067 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-08-01 09:33:30 -07:00
Harry Mellor	2d7b09b998	Deprecate `--disable-log-requests` and replace with `--enable-log-requests` (#21739 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-01 17:16:37 +01:00
Gamhang	0a6d305e0f	feat(multimodal): Add customizable background color for RGBA to RGB conversion (#22052 ) Signed-off-by: Jinheng Li <ahengljh@gmail.com> Co-authored-by: Jinheng Li <ahengljh@gmail.com>	2025-08-01 06:07:33 -07:00
Harry Mellor	fb0e0d46fc	Fix `get_kwargs` for case where type hint is `list[Union[str, type]]` (#22016 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-08-01 05:26:42 -07:00
Dipika Sikka	dfbc1f8880	[Speculative Decoding] Add `speculators` config support (#21345 )	2025-08-01 08:25:18 -04:00
Cyrus Leung	82de9b9d46	[Misc] Automatically resolve HF processor init kwargs (#22005 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-31 22:44:10 -07:00
Charent	ad57f23f6a	[Bugfix] Fix: Fix multi loras with tp >=2 and LRU cache (#20873 ) Signed-off-by: charent <19562666+charent@users.noreply.github.com>	2025-07-31 19:48:13 -07:00
Wentao Ye	3700642013	[Refactor] Remove Duplicate `per_block_cast_to_fp8`, Remove Dependencies of DeepGEMM (#21787 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-08-01 01:13:27 +00:00
Matthew Bonanni	e360316ab9	Add DeepGEMM to Dockerfile in vllm-base image (#21533 ) Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-07-31 18:01:55 -07:00
Ilya Markov	6e672daf62	Add FlashInfer allreduce RMSNorm Quant fusion (#21069 ) Signed-off-by: ilmarkov <imarkov@redhat.com> Signed-off-by: ilmarkov <markovilya197@gmail.com> Co-authored-by: ilmarkov <imarkov@redhat.com>	2025-07-31 13:58:38 -07:00
Yong Hoon Shin	71470bc4af	[Misc] Add unit tests for chunked local attention (#21692 ) Signed-off-by: Yong Hoon Shin <yhshin@meta.com>	2025-07-31 11:39:16 -07:00
zhiweiz	9e0726e5bf	[Meta] Official Eagle mm support, first enablement on llama4 (#20788 ) Signed-off-by: morgendave <morgendave@gmail.com> Co-authored-by: Roger Wang <hey@rogerw.me>	2025-07-31 10:35:07 -07:00
Song	9484641616	[Model] Add step3 vl (#21998 ) Signed-off-by: oliveryuan <yuansong@step.ai> Co-authored-by: oliveryuan <yuansong@step.ai>	2025-07-31 23:19:06 +08:00
Nick Hill	5daffe7cf6	[BugFix] Fix case where `collective_rpc` returns `None` (#22006 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-31 12:51:37 +00:00
wang.yuqi	2836dd73f1	[Model][CI] Let more pooling models support v1 (#21747 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-31 01:51:15 -07:00
Ning Xie	3e36fcbee6	[Bugfix]: fix metadata file copy in test_sharded_state_loader (#21830 ) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-07-31 06:22:11 +00:00
Michael Goin	055bd3978e	[CI Bugfix] Fix CI OOM for `test_shared_storage_connector_hashes` (#21973 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-07-31 11:45:29 +08:00
Zebing Lin	ca9e2be3ed	[Core] Move EngineCoreRequest to Request conversion out of EngineCore (#21627 ) Signed-off-by: linzebing <linzebing1995@gmail.com>	2025-07-30 15:00:54 -07:00
cascade	287f527f54	[Feature] Add async tensor parallelism for scaled mm (#20155 ) Signed-off-by: cascade812 <cascade812@outlook.com>	2025-07-30 17:23:41 -04:00
Nick Hill	56bd537dde	[Misc] Support more collective_rpc return types (#21845 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-07-30 10:20:20 -07:00
wxsm	f4135232b9	feat(distributed): add `get_required_kvcache_layout` class method to kv connector api (#20433 ) Signed-off-by: wxsm <wxsms@foxmail.com>	2025-07-30 16:41:51 +00:00
Chenguang Zheng	4904e53c32	[Bugfix] SharedStorage Connector for V1 PD multimodal (#21611 ) Signed-off-by: fake0fan <645327136@qq.com> Signed-off-by: herotai214 <herotai214@gmail.com> Co-authored-by: herotai214 <herotai214@gmail.com>	2025-07-30 09:18:37 -07:00
Cyrus Leung	004203e953	[CI/Build] Fix registry tests (#21934 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-30 09:10:41 -07:00
633WHU	5c765aec65	[Bugfix] Fix TypeError in scheduler when comparing mixed request_id types (#21816 ) Signed-off-by: chiliu <chiliu@paypal.com> Co-authored-by: chiliu <chiliu@paypal.com>	2025-07-30 08:54:44 -07:00
Yong Hoon Shin	ad510309ee	Override attention metadata for fast prefill in some KV sharing setups (#21590 ) Signed-off-by: Yong Hoon Shin <yhshin@meta.com>	2025-07-30 08:54:15 -07:00
Isotr0py	6e599eebe8	[Bugfix] Fix OOM tests in initialization test (#21921 ) Signed-off-by: Isotr0py <2037008807@qq.com>	2025-07-30 07:35:47 -07:00
Ruixiang Tan	8f4a1c9a04	[Misc] Improve code readability of KVCacheManager (#21673 ) Signed-off-by: tanruixiang <tanruixiang0104@gmail.com> Signed-off-by: Ruixiang Tan <819464715@qq.com> Signed-off-by: GitHub <noreply@github.com>	2025-07-30 07:20:43 -07:00
Wentao Ye	0271c2ff2f	[Test] Add Benchmark and Unit Test for `per_token_group_quant` (#21860 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-07-30 07:15:02 -07:00
Varun Vinayak Shenoy	547795232d	[Tests] Fixing bug inside MultiModalProfiler. (#21842 ) Signed-off-by: Varun Shenoy <varun.vinayak.shenoy@oracle.com>	2025-07-30 00:44:15 -07:00
wang.yuqi	65f311ce59	[Frontend] Add LLM.reward specific to reward models (#21720 ) Signed-off-by: wang.yuqi <noooop@126.com>	2025-07-29 20:56:03 -07:00
Chen Zhang	555e7225bc	[v1][attention] Support Hybrid Allocator + FlashInfer (#21412 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-07-30 01:45:29 +00:00
elvischenv	58b11b24a6	[Bugfix] Fix workspace buffer None issue for Flashinfer TRTLLM Backend (#21525 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-07-29 10:34:00 -04:00
Richard Zou	04e38500ee	[Bugfix] VLLM_V1 supports passing other compilation levels (#19340 ) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-07-29 09:35:58 -04:00

2493 Commits