xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-30 01:23:31 +08:00

Author	SHA1	Message	Date
Harry Mellor	51fc9e017a	Scheduled removal of `CompilationConfig.use_inductor` (#29323 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 12:55:42 +00:00
Roger Wang	c2c661af9b	[Bugfix] Fix overallocation in MM profiling (#29386 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-25 12:38:36 +00:00
Nicolò Lucchesi	798e87db5c	[Core] Generalize Encoder-Decoder `seq_lens` computation to avoid Whisper hardcoded logic (#29268 ) Signed-off-by: NickLucche <nlucches@redhat.com> Co-authored-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>	2025-11-25 11:32:11 +00:00
wang.yuqi	de6889946b	[Misc] Suppress log outputs when constructing the default vllm config. (#29291 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 03:00:44 -08:00
Ben Browning	e1dd706cd1	[Frontend] Respect Chat Completion parallel_tool_calls param (#26233 ) Signed-off-by: Ben Browning <bbrownin@redhat.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>	2025-11-25 09:56:15 +00:00
Andrew Xia	a685b47c57	[responsesAPI] refactor construct_input_messages (#29359 ) Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>	2025-11-25 09:47:10 +00:00
Avishek Goswami	32c40b95e0	[BugFix] bad_words filtering ineffective when n > 1 (#29313 ) Signed-off-by: GOavi101 <1704178@kiit.ac.in>	2025-11-25 09:36:34 +00:00
Nick Hill	db2906108a	[Misc] Streamline unique id generation (#29375 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-25 08:30:11 +00:00
wang.yuqi	67fc16cd8c	[Bugfix] If chunked_prefill is disabled, end the scheduling early. (#28911 ) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-11-25 16:06:09 +08:00
elvischenv	6330f9477d	[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841 ) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>	2025-11-25 07:59:40 +00:00
Micah Williamson	ef1f7030f0	[ROCm][CI] Fix test_cudagraph_mode failure in AMD CI (#29367 ) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-11-25 07:55:09 +00:00
Rémi Delacourt	12c007e288	EAGLE Support DP>1 (#26086 ) Signed-off-by: Rémi Delacourt <remi@mistral.ai> Signed-off-by: Rémi Delacourt <54138269+Flechman@users.noreply.github.com> Signed-off-by: remi <remi@mistral.ai>	2025-11-25 07:32:21 +00:00
zhrrr	f242cfcdd5	[Perf] use cpu all reduce to avoid sync when async_scheduling & dp > 1 (#29311 ) Signed-off-by: zhuhaoran <zhuhaoran.zhr@alibaba-inc.com>	2025-11-25 15:31:07 +08:00
Icey	888152bf87	Allow oot custom compiler extension via CompilerInterface (#28623 ) Signed-off-by: wxsIcey <1790571317@qq.com> Signed-off-by: Mengqing Cao <cmq0113@163.com> Signed-off-by: Icey <1790571317@qq.com> Co-authored-by: Mengqing Cao <cmq0113@163.com>	2025-11-25 15:25:15 +08:00
Fadi Arafeh	98caeadd54	[fix][cpu] Use a SwigluOAI impl which supports interleaved gate-up wei (#29273 ) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-11-25 15:11:11 +08:00
vllmellm	64deead719	[Bugfix] [ROCm] [UX]: revert Flex attention backend (#29371 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-11-25 06:56:06 +00:00
Nick Hill	7992324f23	[BugFix] Use unique ids for different transcription prompts (#29372 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-25 06:55:16 +00:00
kflu	ce58fdc1c3	Fix PoolingParams.skip_reading_prefix_cache type (#29364 ) Signed-off-by: KFL <kludev@gmail.com>	2025-11-25 06:39:29 +00:00
Harry Mellor	316c8492bf	Scheduled removal of `guided_*` config fields (#29326 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 05:24:05 +00:00
Lucas Wilkinson	2d9ee28cab	[CI/Test Fix] Fix CP tests on Blackwell (#29338 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-11-24 20:55:57 -08:00
Jiangyun Zhu	81db702ed2	[Attention] add `_cudagraph_support` for linear attention (#28934 ) Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>	2025-11-25 12:25:20 +08:00
Isotr0py	92effb07a4	[Model] Add HunyuanOCR support (#29327 ) Signed-off-by: manayang <jackmanayang@gmail.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: sergeywang <sergeywang@tencent.com> Co-authored-by: manayang <jackmanayang@gmail.com> Co-authored-by: manayang <manayang@tencent.com> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-25 03:28:51 +00:00
Maryam Tahhan	87185c88d5	[Bugfix] Make deprecated `--task embedding` consistent with `--runner… (#29312 ) Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>	2025-11-25 03:19:52 +00:00
Mark McLoughlin	9cf4edae6e	[Metrics] Scheduled removal of deprecated metrics (#29330 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com>	2025-11-25 11:15:13 +08:00
gbyu-amd	cb7214d8ea	[ROCm][MLA] enable fp8 MLA decode on ROCm (#28032 ) Signed-off-by: guanbao <gyu@amd.com> Signed-off-by: Guanbao Yu <gyu@amd.com> Signed-off-by: gbyu-amd <Guanbao.Yu@amd.com> Co-authored-by: guanbao <gyu@amd.com>	2025-11-25 10:15:02 +08:00
Pleaplusone	77e10c9cab	[Perf][Deepseek] optimize gather_and_maybe_dequant_cache kernel's perf for extremely long sequence (#28029 ) Signed-off-by: ganyi <ygan@amd.com>	2025-11-24 19:05:46 -07:00
Michael Goin	6f1355a1b7	[Perf] Disable DeepGEMM MoE by default when TP=8 is used (#29346 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-11-24 19:01:40 -07:00
Harry Mellor	a4ad43ad5a	Scheduled removal of `ParallelConfig`'s direct child EPLB fields (#29324 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-25 01:58:58 +00:00
Nick Hill	a178a0b40b	[BugFix] Fix duplicate id tool-call race condition (#29355 ) Signed-off-by: Nick Hill <nhill@redhat.com>	2025-11-25 01:54:26 +00:00
Hanjie Qiu	5f9679a43b	[Spec Decode] Add support for EAGLE3 heads that do not use_aux_hidden_states (#27688 ) Signed-off-by: hjjq <hanjieq@nvidia.com> Signed-off-by: Benjamin Chislett <bchislett@nvidia.com> Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>	2025-11-24 20:13:12 -05:00
Wentao Ye	699bca76c0	[UX] Raise error for attn backend of batch invariant (#29348 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-11-24 17:49:01 -07:00
Michael Goin	c17610e2ba	[Bugfix] Only use triton_kernels for MXFP4 on SM90 and SM100 (#29339 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-11-24 18:22:46 -05:00
Chen Zhang	71df2a57ef	[Hybrid Allocator] Better layer padding strategy for gpt-oss eagle (#29303 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-11-24 14:28:32 -08:00
Woosuk Kwon	f32c7d6f54	[Model Runner V2] Simplify Eagle bookkeeping with num_rejected (#29347 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-11-24 13:54:59 -08:00
Yan Ma	3cfa63ad99	[XPU]fix Kimi-VL-A3B-thinking on xpu (#29309 ) Signed-off-by: Yan Ma <yan.ma@intel.com>	2025-11-24 21:02:21 +00:00
Woosuk Kwon	97588c4d12	[Model Runner V2] Add minor clarification comments for Eagle (#29332 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-11-24 11:28:56 -08:00
Chenheli Hua	839c6b7b72	[Multimodal][Qwen3 Omni] Make Qwen3 Omni work with audio-in-video inputs in V1 engine. (#27721 ) Signed-off-by: Chenheli Hua <huachenheli@outlook.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-11-24 19:24:37 +00:00
bnellnm	8f066146c3	[MoE][Refactor] Make select_experts a non-static method (#29067 ) Signed-off-by: Bill Nell <bnell@redhat.com>	2025-11-24 13:38:04 -05:00
Woosuk Kwon	cec418b5df	[Model Runner V2] Change Numba AoT to JIT (#29328 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-11-24 09:34:37 -08:00
Woosuk Kwon	cc313cb73d	[Model Runner V2] Implement Single-step Eagle 1 (#29300 ) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-11-24 09:32:27 -08:00
Nicolò Lucchesi	26a465584a	[NIXL] Use config to enable telemetry + NIXL version bump (#29305 ) Signed-off-by: NickLucche <nlucches@redhat.com>	2025-11-24 17:18:04 +00:00
Aydin Abiar	656516c315	[Bugfix] properly handle nested json with llama3 tool parser (#27701 ) Signed-off-by: Aydin Abiar <aydin@anyscale.com> Signed-off-by: Aydin Abiar <62435714+Aydin-ab@users.noreply.github.com> Co-authored-by: Aydin Abiar <aydin@anyscale.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>	2025-11-24 15:28:51 +00:00
vllmellm	e48b2e6848	[Bugfix] [ROCm] [UX] Reorganize ROCm Backend Selection Logic (#26980 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>	2025-11-24 15:24:49 +00:00
Laith Sakka	7a228b5305	Add option to use unbacked, and backed size obl dynamic shapes for more sounds compilation. (#26199 ) Signed-off-by: Laith Sakka <lsakka@meta.com>	2025-11-24 10:12:41 -05:00
WeiQing Chen	2601f18a82	[EPLB] Optimize EPLB for Async Rearrange Experts (#22179 ) Signed-off-by: David Chen <530634352@qq.com> Co-authored-by: SunChenxiang123 <1291824390@qq.com>	2025-11-24 09:08:29 -05:00
Didier Durand	eca7a8fb59	[Doc]: fix typos in various files (#29230 ) Signed-off-by: Didier Durand <durand.didier@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-11-24 11:10:48 +00:00
杰兮	8005e606bf	[Bugfix][Rocm] Fix shared expert weight loading failure in DeepSeek-MTP (#27563 ) Signed-off-by: zhyajie <yajizhan@amd.com> Co-authored-by: zhyajie <yajizhan@amd.com>	2025-11-24 10:16:52 +00:00
rongfu.leng	68dfe28eae	[Feature][Benchmark] add --link-vars can filter when serve_param equal bench_param (#28909 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-11-24 02:02:28 -08:00
Roger Wang	0ff70821c9	[Core] Deprecate `xformers` (#29262 ) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-24 04:18:55 +00:00
tongqiu	5253f4276f	[ROCm] Support for Whisper v1 with Aiter Unified Attention and Aiter Flash Attention (#28376 ) Signed-off-by: apinge <Tong.Qiu2@amd.com>	2025-11-24 03:26:00 +00:00

8126 Commits