xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-05 12:07:13 +08:00

Author	SHA1	Message	Date
ゆり	d02ed59762	Merge 2843784c1cc1ca8118d416a1902ed334a0dde3d2 into 254f6b986720c92ddf97fbb1a6a6465da8e87e29	2025-12-25 00:07:16 +00:00
Richard Zou	254f6b9867	[Bugfix] Fix eagle dp tests on A100 (#31241) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-12-25 00:05:04 +00:00
Michael Goin	bc5ef333e0	[Perf] Add skip_clone to SamplingParams for internal request handling (#31041) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-24 14:35:57 -08:00
Cyrus Leung	09dc7c690c	[Chore][1/2] Drop `v0.14` deprecations (#31285) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 09:54:01 -08:00
ゆり	506eb0f454	[Bugfix] Remove dead `block_quant_to_tensor_quant` function (#31294) Co-authored-by: yurekami <yurekami@users.noreply.github.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-24 17:22:48 +00:00
Ning Xie	5d93089686	[cli] complete vllm cli help message (#31226) Signed-off-by: Andy Xie <andy.xning@gmail.com>	2025-12-24 15:45:47 +00:00
Kevin McKay	66c9887440	[Bugfix][Hardware][AMD] Fix FP8 dtype in silu_mul quantization (#31179) Signed-off-by: c0de128 <kevin.mckay@outlook.com>	2025-12-24 10:37:11 -05:00
wang.yuqi	1ff67df182	[CI] Reorganization pooling_mteb_test (#31265) Signed-off-by: wang.yuqi <noooop@126.com>	2025-12-24 23:36:20 +08:00
yurekami	2843784c1c	fix: handle None tokenizer in multimodal processor initialization When skip_tokenizer_init=True is set, the tokenizer is None. Previously, this None value was unconditionally passed to the processor, which overrode the processor's ability to load its own tokenizer from the model path. This caused crashes in multimodal models like gemma-3 that require a tokenizer during processor initialization. The fix is to only pass the tokenizer kwarg when it's not None, allowing the processor to load its own tokenizer when skip_tokenizer_init=True. Fixes #31123 Signed-off-by: yurekami <69337011+yurekami@users.noreply.github.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: yurekami <yurekami@users.noreply.github.com>	2025-12-24 23:41:40 +09:00
skaraban3807	7cd288a4b3	[PERF] Add interleaved memory allocation to NUMA module (#30800)	2025-12-24 13:47:49 +00:00
Cyrus Leung	d201807339	[Chore] Bump `lm-eval` version (#31264) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 05:39:13 -08:00
Cyrus Leung	aa3868ecfe	[Chore] Remove unused `noqa`s (#31263) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 05:38:46 -08:00
Cyrus Leung	7adeb4bfa8	[Bugfix] Fix `max_model_len="auto"` handling (#31260) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 19:15:27 +08:00
wang.yuqi	bd89ce16d2	[Model] Introduce verify_and_update_model_config for VerifyAndUpdateConfig. (#31131) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com>	2025-12-24 09:54:57 +00:00
Pleaplusone	b41aeb3468	[Bugfix][ROCm] Fix load issue on deepseek quark quantization when shared expert enabled (#31261) Signed-off-by: ganyi <ygan@amd.com>	2025-12-24 16:47:44 +08:00
Ryan Rock	ddfac7034e	[CI/Build] Ignore data_parallel_size_local (#30281) Signed-off-by: Ryan Rock <ryan.rock@amd.com>	2025-12-24 07:40:54 +00:00
Micah Williamson	6559d96796	[ROCm][CI] Set TORCH_NCCL_BLOCKING_WAIT Distributed Tests On ROCm (#31259) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-12-24 07:19:07 +00:00
kliuae	1c74150bca	[ROCm][CI] Fix "Distributed Tests (H200)" Test (#31227) Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>	2025-12-24 06:56:30 +00:00
Andreas Karatzas	0247a91e00	[ROCm][CI] Fix entrypoints tests and Python-only installation test on ROCm (#28979) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-23 22:42:30 -08:00
Michael Goin	8ee90c83f8	Add `--max-model-len auto` to auto-fit context to available memory (#29431) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-23 21:37:14 -08:00
Nick Cao	d7e05ac743	[docker] Fix downloading sccache on aarch64 platform (#30070) Signed-off-by: Nick Cao <nickcao@nichi.co>	2025-12-23 21:36:33 -08:00
sihao_li	471ddb99a0	[XPU] Remove distributed_executor_backend check (#30760) Signed-off-by: sihao.li <sihao.li@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-12-23 21:34:33 -08:00
Xiong Wang	bb24592d13	[Qwen3-Omni] fixed _get_feat_extract_output_lengths function (#31007) Signed-off-by: Xiong Wang <wangxiongts@163.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-12-23 21:33:54 -08:00
Matthew Bonanni	369f47aa0f	[DeepSeek v3.2] Remove unnecessary syncwarps (#31047) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-12-23 21:33:30 -08:00
zejunchen-zejun	dabff12ed3	[Bugfix][ROCm][Dynamo][DS 3.1][FP8] fix unsupported hasattr call when Dynamo tracing for ROCm device (#31149) Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>	2025-12-23 21:32:19 -08:00
Ming Yang	3bb9561928	Revert "[bench] Support common prefix len config (for decode-only bench)" (#31240) Signed-off-by: Ming Yang <minos.future@gmail.com>	2025-12-23 21:17:23 -08:00
Micah Williamson	3ce791ac77	[ROCm][CI] Set VLLM_FLOAT32_MATMUL_PRECISION="tf32" For terratorch Tests In AMD CI (#31242) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-12-24 03:21:50 +00:00
Andreas Karatzas	e42894f5b5	[ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance (#31235) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-24 02:56:58 +00:00
Wentao Ye	76e6a95192	[Bug] Fix `Number of dimensions of tensors must match.` for Deepseek V3.2 (#31160) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-12-24 10:41:09 +08:00
Chao Lei	8b59753cdb	[P/D] Mooncake connector support more protocols (#30133) Signed-off-by: LCAIZJ <leichao139636@163.com>	2025-12-24 10:24:07 +08:00
Chen Zhang	538e830caa	[KVEvent] User request.block_hash for parent block_hash (#30544) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu> Co-authored-by: Yifan Qiao <yifanqiao@berkeley.edu>	2025-12-23 18:23:43 -08:00
rongfu.leng	4ed11105d7	[Misc] Remove unused custom ops `copy_blocks` and `copy_blocks_mla` (#30967) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-12-23 18:22:35 -08:00
Cyrus Leung	dd424571c8	[Bugfix] Enable `dynamic_dims` for different embeds shape (#31223) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 10:15:47 +08:00
Cyrus Leung	ca6a95ba25	[Chore] Simplify logic of `_execute_mm_encoder` (#31222) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-23 18:15:16 -08:00
Vadim Gimpelson	bc0a5a0c08	[CI] Add Qwen3-Next-FP8 to Blackwell model tests (#31049) Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>	2025-12-23 17:21:50 -08:00
Andreas Karatzas	bfa2c0bbb9	[ROCm][Bugfix] Fix RuntimeError in MMEncoderAttention by replacing .view() with .reshape() (#31203) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-23 21:48:01 +00:00
Mark McLoughlin	f790068600	[Core] Add a random suffix to frontend-provided request IDs (#27987) Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-12-23 13:05:39 -08:00
Asaf Joseph Gardin	34916ae37f	[Mamba] - Consolidate Mambas Attention Logic (#28133)	2025-12-23 21:57:00 +01:00
Yuan Tang	0736f901e7	docs: Add llm-d integration to the website (#31234) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-12-23 20:27:22 +00:00
Harry Mellor	c016c95b45	Use helper function instead of looping through attribute names (#29788) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-23 17:31:56 +00:00
Harry Mellor	1339878e13	Only patch `original_max_position_embeddings` for Transformers v4 (#31214) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-23 16:46:32 +00:00
danielafrimi	b94f80ffb8	[FIX] FP4 quantization kernel padding initialization bug (#31097) Signed-off-by: <> Co-authored-by: root <root@gpu-193.slurm-workers-slurm.slurm.svc.cluster.local> Co-authored-by: root <root@gpu-951.slurm-workers-slurm.slurm.svc.cluster.local>	2025-12-23 08:45:18 -08:00
Joachim Studnia	38c361f99d	Fix edge case Mistral tool parser (#30724) Signed-off-by: Joachim Studnia <joachim@mistral.ai> Signed-off-by: Joachim Studnia <studniajoachim@gmail.com> Signed-off-by: juliendenize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2025-12-23 14:19:58 +00:00
Cyrus Leung	bb62dda2c3	[Misc] Introduce `encode_*_url` utility function (#31208) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-23 13:45:21 +00:00
Patrick von Platen	3faa8bee57	adapt voxtral (#31095) Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>	2025-12-23 05:31:55 -08:00
Harry Mellor	b10d47e0e0	Add util function for checking nesting of rope parameters (#31146) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-23 11:41:49 +00:00
R3hankhan	769f27e701	[OpenAI] Add parameter metadata to validation errors (#30134) Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>	2025-12-23 11:30:12 +00:00
Jakub Zakrzewski	23daef548d	[Frontend] Support using chat template as custom score template for reranking models (#30550) Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-23 11:19:16 +00:00
Jee Jee Li	27c6c2f98c	[Bugfix] Fix MoE LoRA bin/pt loading (#31161) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-23 19:09:15 +08:00
Weida Hong	73cfb7a722	Correct position of docstring of class attributes (#31209) Signed-off-by: Weida Hong <wdhongtw@google.com>	2025-12-23 02:08:58 -08:00

12524 Commits