xinyun/vllm - vllm - 丝路新云 code repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-05-23 05:24:25 +08:00

Author	SHA1	Message	Date
wang.yuqi	7a80b01889	[CI] Resettle pooling entrypoints tests. (#29370 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-11-25 10:39:10 +00:00
Ben Browning	e1dd706cd1	[Frontend] Respect Chat Completion parallel_tool_calls param (#26233 ) Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>	2025-11-25 09:56:15 +00:00
wang.yuqi	67fc16cd8c	[Bugfix] If chunked_prefill is disabled, end the scheduling early. (#28911 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-11-25 16:06:09 +08:00
elvischenv	6330f9477d	[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-11-25 07:59:40 +00:00
Micah Williamson	ef1f7030f0	[ROCm][CI] Fix test_cudagraph_mode failure in AMD CI (#29367 ) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-11-25 07:55:09 +00:00
Rémi Delacourt	12c007e288	EAGLE Support DP>1 (#26086 ) Signed-off-by: Rémi Delacourt <remi@mistral.ai> Signed-off-by: Rémi Delacourt <54138269+Flechman@users.noreply.github.com> Signed-off-by: remi <remi@mistral.ai>	2025-11-25 07:32:21 +00:00
vllmellm	64deead719	[Bugfix] [ROCm] [UX]: revert Flex attention backend (#29371 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-11-25 06:56:06 +00:00
Harry Mellor	316c8492bf	Scheduled removal of `guided_*` config fields (#29326 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 05:24:05 +00:00
Isotr0py	92effb07a4	[Model] Add HunyuanOCR support (#29327 ) Signed-off-by: manayang <jackmanayang@gmail.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: sergeywang <sergeywang@tencent.com> Co-authored-by: manayang <jackmanayang@gmail.com> Co-authored-by: manayang <manayang@tencent.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-25 03:28:51 +00:00
Mark McLoughlin	9cf4edae6e	[Metrics] Scheduled removal of deprecated metrics (#29330 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-11-25 11:15:13 +08:00
Pleaplusone	77e10c9cab	[Perf][Deepseek] optimize gather_and_maybe_dequant_cache kernel's perf for extremely long sequence (#28029 ) Signed-off-by: ganyi <ygan@amd.com>	2025-11-24 19:05:46 -07:00
Chen Zhang	71df2a57ef	[Hybrid Allocator] Better layer padding strategy for gpt-oss eagle (#29303 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-11-24 14:28:32 -08:00
Nick Hill	84371daf75	[Tests] Verify gpt_oss package is installed in harmony tests (#29336 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-24 22:04:31 +00:00
Chenheli Hua	839c6b7b72	[Multimodal][Qwen3 Omni] Make Qwen3 Omni work with audio-in-video inputs in V1 engine. (#27721 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-24 19:24:37 +00:00
bnellnm	8f066146c3	[MoE][Refactor] Make select_experts a non-static method (#29067 ) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-11-24 13:38:04 -05:00
Aydin Abiar	656516c315	[Bugfix] properly handle nested json with llama3 tool parser (#27701 ) Signed-off-by: Aydin Abiar <aydin@anyscale.com> Signed-off-by: Aydin Abiar <62435714+Aydin-ab@users.noreply.github.com> Co-authored-by: Aydin Abiar <aydin@anyscale.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>	2025-11-24 15:28:51 +00:00
vllmellm	e48b2e6848	[Bugfix] [ROCm] [UX] Reorganize ROCm Backend Selection Logic (#26980 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-11-24 15:24:49 +00:00
Laith Sakka	7a228b5305	Add option to use unbacked, and backed size obl dynamic shapes for more sounds compilation. (#26199 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-11-24 10:12:41 -05:00
WeiQing Chen	2601f18a82	[EPLB] Optimize EPLB for Async Rearrange Experts (#22179 ) Signed-off-by: David Chen <530634352@qq.com> Co-authored-by: SunChenxiang123 <1291824390@qq.com>	2025-11-24 09:08:29 -05:00
Roger Wang	0ff70821c9	[Core] Deprecate `xformers` (#29262 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-24 04:18:55 +00:00
Zero	30854783ad	[Model] Add OpenCUA-7B support (#29068 ) Signed-off-by: lim4349 <rockmanzero@naver.com> Signed-off-by: Zero <rockmanzero@naver.com> Co-authored-by: Cloud User <ubuntu@a100-80g-4.novalocal> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-24 10:27:55 +08:00
Jee Jee Li	1073ba68b0	[LoRA] Optimize 3D MoE logic (#29222 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-11-24 10:27:23 +08:00
Micah Williamson	55c21c8836	[ROCm][CI] Fix "Cannot re-initialize CUDA in forked subprocess" in test_pynccl.py (#29119 ) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-11-23 13:05:00 +08:00
rasmith	3999442f1c	[CI/Build][AMD] Add check for flash_att_varlen_func to test_tree_attention.py (#29252 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-23 04:45:08 +00:00
rasmith	71362ffab4	[CI/Build][AMD] Skip test_multi_shared_storage_connector_consistency in test_multi_connector.py due to hipErrorLaunchFailure when calling .cpu() (#29253 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-23 04:42:49 +00:00
Cyrus Leung	389aa1b2eb	[Doc] Update more docs with respect to V1 (#29188 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-23 10:58:48 +08:00
Nick Hill	7df331c66b	[BugFix] Fix chunked prompt logprobs + preemption (#29071 )	2025-11-22 16:07:18 -05:00
Nick Hill	d44a63c6d6	[BugFix] Fix returned logprobs with spec decode + prefill chunking (#29216 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-22 22:41:25 +08:00
Nicolò Lucchesi	066209a045	[Attention] Refactor FA `block_size` limitations to hybrid models only (#29084 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-11-22 06:38:44 -08:00
Cyrus Leung	5a4802588e	[Misc] Further clean up chunked prefill and prefix caching init (#29186 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-22 19:34:15 +08:00
rasmith	8e22da1d7f	[CI/Build Don't add FLASHINFER backend in test_cpu_offloading.py (#29229 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-22 11:00:54 +00:00
rasmith	a4fdf2405c	[CI/Build] Skip tests that require libcudart in test_lmcache_integration.py (#29228 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-22 10:59:39 +00:00
Andrew Xia	742e9ff6b3	[responsesAPI] parse reasoning item input (#28248 ) Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-22 15:42:11 +08:00
rasmith	fd65015a14	[CI/Build] Only use supported types and features on ROCm in MoE kernel tests (#29149 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-21 20:34:33 -07:00
rasmith	6f403501a0	[CI/Build][AMD] Enable Entrypoints Integration Test (Pooling) to run without error on ROCm (#29212 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-22 02:13:18 +00:00
Lucas Wilkinson	30d6466238	[BugFix] Fix Eagle `IndexError: list index out of range` for even `num_speculative_tokens` (#29102 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-11-22 00:47:05 +00:00
Mark McLoughlin	c6fa3895e9	[KV Connector] Fix async connector prefix cache metrics (#28585 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>	2025-11-21 17:45:00 -05:00
Varun Sundar Rabindranath	3137991f55	[BugFix] EPLB + B200 + DeepGEMM : Handle column-major scales tensor (#29162 ) Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com> Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>	2025-11-21 14:28:17 -08:00
Julien Denize	57430fc95c	Default model load/config/tokenizer to `mistral` format if relevant files exist (#28659 ) Signed-off-by: Julien Denize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: mgoin <mgoin64@gmail.com>	2025-11-21 13:58:59 -08:00
Wentao Ye	1f400c58b8	[CI] Add batch invariant test to ci (#27842 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-21 09:20:33 -07:00
rasmith	711241c13c	[CI/Build] Fix illegal memory access and unsupported test in kernels/attention/test_cache.py (#29118 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-21 10:58:38 -05:00
Cyrus Leung	aab0102a26	[V0 deprecation] Remove more V0 references (#29088 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-21 11:56:59 +00:00
WeiQing Chen	b34129bf8e	[Misc] remove useless v1 env (#29164 ) Signed-off-by: David Chen <530634352@qq.com>	2025-11-21 01:41:20 -08:00
Alex Brooks	b4734b9550	[Bugfix] Fix default MM LoRA alignment for single str prompts (#29140 ) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2025-11-21 13:32:30 +08:00
Jialin Ouyang	30b9c67743	Revert "[Redo] #26368 (#28771 )" (#29121 ) Signed-off-by: Jialin Ouyang <Jialin.Ouyang@gmail.com>	2025-11-20 21:27:45 -08:00
Cyrus Leung	56e96b37e4	[V0 Deprecation] Remove `best_of` (#29090 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-11-21 11:40:40 +08:00
jeremyteboul	0730414999	[Core] Add audio_embeds support to chat completions (#29059 ) Signed-off-by: Jeremy Teboul <jeremyteboul@fb.com> Co-authored-by: Jeremy Teboul <jeremyteboul@fb.com>	2025-11-21 11:39:47 +08:00
Jee Jee Li	9875be6431	[LoRA][2/2]Remove LoRA extra vocab (#28545 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-11-21 09:46:43 +08:00
Michael Goin	87cbbdff63	Update model references for OLMo3 (#29099 ) Signed-off-by: mgoin <mgoin64@gmail.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-11-21 09:16:52 +08:00
rasmith	c7a29d2c8d	[CI/Build] Remove skip global cleanup in test_struct_output_generate.py (#29022 ) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>	2025-11-20 21:44:37 +00:00

3641 Commits