xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-07 08:37:12 +08:00

Author	SHA1	Message	Date
majiayu000	abd1dbc548	[Bugfix] Preserve original tokenizer class name in CachedTokenizer HuggingFace transformers processor validates tokenizer type by checking the class name. When vLLM creates a CachedTokenizer with a modified class name (e.g., 'CachedQwen2TokenizerFast'), the processor type check fails with TypeError. This fix preserves the original tokenizer class name and qualname in CachedTokenizer, ensuring compatibility with HuggingFace transformers processor type checking. Fixes #31080 Signed-off-by: Claude <noreply@anthropic.com> Signed-off-by: majiayu000 <1835304752@qq.com>	2025-12-24 16:02:48 +08:00
Ryan Rock	ddfac7034e	[CI/Build] Ignore data_parallel_size_local (#30281 ) Signed-off-by: Ryan Rock <ryan.rock@amd.com>	2025-12-24 07:40:54 +00:00
Micah Williamson	6559d96796	[ROCm][CI] Set TORCH_NCCL_BLOCKING_WAIT Distributed Tests On ROCm (#31259 ) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-12-24 07:19:07 +00:00
kliuae	1c74150bca	[ROCm][CI] Fix "Distributed Tests (H200)" Test (#31227 ) Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>	2025-12-24 06:56:30 +00:00
Andreas Karatzas	0247a91e00	[ROCm][CI] Fix entrypoints tests and Python-only installation test on ROCm (#28979 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-23 22:42:30 -08:00
Michael Goin	8ee90c83f8	Add `--max-model-len auto` to auto-fit context to available memory (#29431 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-23 21:37:14 -08:00
Nick Cao	d7e05ac743	[docker] Fix downloading sccache on aarch64 platform (#30070 ) Signed-off-by: Nick Cao <nickcao@nichi.co>	2025-12-23 21:36:33 -08:00
sihao_li	471ddb99a0	[XPU] Remove distributed_executor_backend check (#30760 ) Signed-off-by: sihao.li <sihao.li@intel.com> Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>	2025-12-23 21:34:33 -08:00
Xiong Wang	bb24592d13	[Qwen3-Omni] fixed _get_feat_extract_output_lengths function (#31007 ) Signed-off-by: Xiong Wang <wangxiongts@163.com> Signed-off-by: Roger Wang <hey@rogerw.io> Co-authored-by: Roger Wang <hey@rogerw.io>	2025-12-23 21:33:54 -08:00
Matthew Bonanni	369f47aa0f	[DeepSeek v3.2] Remove unnecessary syncwarps (#31047 ) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-12-23 21:33:30 -08:00
zejunchen-zejun	dabff12ed3	[Bugfix][ROCm][Dynamo][DS 3.1][FP8] fix unsupported hasattr call when Dynamo tracing for ROCm device (#31149 ) Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>	2025-12-23 21:32:19 -08:00
Ming Yang	3bb9561928	Revert "[bench] Support common prefix len config (for decode-only bench)" (#31240 ) Signed-off-by: Ming Yang <minos.future@gmail.com>	2025-12-23 21:17:23 -08:00
Micah Williamson	3ce791ac77	[ROCm][CI] Set VLLM_FLOAT32_MATMUL_PRECISION="tf32" For terratorch Tests In AMD CI (#31242 ) Signed-off-by: Micah Williamson <micah.williamson@amd.com>	2025-12-24 03:21:50 +00:00
Andreas Karatzas	e42894f5b5	[ROCm][CI][Bugfix] Fix Siglip2 rotary embedding dispatch and InternVL video test tolerance (#31235 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-24 02:56:58 +00:00
Wentao Ye	76e6a95192	[Bug] Fix `Number of dimensions of tensors must match.` for Deepseek V3.2 (#31160 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-12-24 10:41:09 +08:00
Chao Lei	8b59753cdb	[P/D] Mooncake connector support more protocols (#30133 ) Signed-off-by: LCAIZJ <leichao139636@163.com>	2025-12-24 10:24:07 +08:00
Chen Zhang	538e830caa	[KVEvent] User request.block_hash for parent block_hash (#30544 ) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu> Co-authored-by: Yifan Qiao <yifanqiao@berkeley.edu>	2025-12-23 18:23:43 -08:00
rongfu.leng	4ed11105d7	[Misc] Remove unused custom ops `copy_blocks` and `copy_blocks_mla` (#30967 ) Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>	2025-12-23 18:22:35 -08:00
Cyrus Leung	dd424571c8	[Bugfix] Enable `dynamic_dims` for different embeds shape (#31223 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-24 10:15:47 +08:00
Cyrus Leung	ca6a95ba25	[Chore] Simplify logic of `_execute_mm_encoder` (#31222 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-23 18:15:16 -08:00
Vadim Gimpelson	bc0a5a0c08	[CI] Add Qwen3-Next-FP8 to Blackwell model tests (#31049 ) Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>	2025-12-23 17:21:50 -08:00
Andreas Karatzas	bfa2c0bbb9	[ROCm][Bugfix] Fix RuntimeError in MMEncoderAttention by replacing .view() with .reshape() (#31203 ) Signed-off-by: Andreas Karatzas <akaratza@amd.com>	2025-12-23 21:48:01 +00:00
Mark McLoughlin	f790068600	[Core] Add a random suffix to frontend-provided request IDs (#27987 ) Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2025-12-23 13:05:39 -08:00
Asaf Joseph Gardin	34916ae37f	[Mamba] - Consolidate Mambas Attention Logic (#28133 )	2025-12-23 21:57:00 +01:00
Yuan Tang	0736f901e7	docs: Add llm-d integration to the website (#31234 ) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-12-23 20:27:22 +00:00
Harry Mellor	c016c95b45	Use helper function instead of looping through attribute names (#29788 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-23 17:31:56 +00:00
Harry Mellor	1339878e13	Only patch `original_max_position_embeddings` for Transformers v4 (#31214 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-23 16:46:32 +00:00
danielafrimi	b94f80ffb8	[FIX] FP4 quantization kernel padding initialization bug (#31097 ) Signed-off-by: <> Co-authored-by: root <root@gpu-193.slurm-workers-slurm.slurm.svc.cluster.local> Co-authored-by: root <root@gpu-951.slurm-workers-slurm.slurm.svc.cluster.local>	2025-12-23 08:45:18 -08:00
Joachim Studnia	38c361f99d	Fix edge case Mistral tool parser (#30724 ) Signed-off-by: Joachim Studnia <joachim@mistral.ai> Signed-off-by: Joachim Studnia <studniajoachim@gmail.com> Signed-off-by: juliendenize <julien.denize@mistral.ai> Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: juliendenize <julien.denize@mistral.ai> Co-authored-by: Julien Denize <40604584+juliendenize@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2025-12-23 14:19:58 +00:00
Cyrus Leung	bb62dda2c3	[Misc] Introduce `encode_*_url` utility function (#31208 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-23 13:45:21 +00:00
Patrick von Platen	3faa8bee57	adapt voxtral (#31095 ) Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>	2025-12-23 05:31:55 -08:00
Harry Mellor	b10d47e0e0	Add util function for checking nesting of rope parameters (#31146 ) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-12-23 11:41:49 +00:00
R3hankhan	769f27e701	[OpenAI] Add parameter metadata to validation errors (#30134 ) Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>	2025-12-23 11:30:12 +00:00
Jakub Zakrzewski	23daef548d	[Frontend] Support using chat template as custom score template for reranking models (#30550 ) Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com> Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io> Signed-off-by: wang.yuqi <noooop@126.com> Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>	2025-12-23 11:19:16 +00:00
Jee Jee Li	27c6c2f98c	[Bugfix] Fix MoE LoRA bin/pt loading (#31161 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-23 19:09:15 +08:00
Weida Hong	73cfb7a722	Correct position of docstring of class attributes (#31209 ) Signed-off-by: Weida Hong <wdhongtw@google.com>	2025-12-23 02:08:58 -08:00
vllmellm	f32cfd7d97	[ROCm][FEAT] Support AITER RMSNorm quantization fusion pass (#26575 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com> Co-authored-by: TJian <tunjian.tan@embeddedllm.com>	2025-12-23 02:07:54 -08:00
Jee Jee Li	6b16fff01b	[Bugfix] Fix Jais2ForCausalLM (#31198 ) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-12-23 07:44:01 +00:00
Yan Ma	f1c2c20136	[XPU] decrease IGC_ForceOCLSIMDWidth for speculative decoding triton-xpu kernel compilation (#30538 ) Signed-off-by: Yan Ma <yan.ma@intel.com>	2025-12-23 05:22:15 +00:00
Cyrus Leung	8cef137689	[Chore] Update more locations to use `attention_config.backend` (#31153 ) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-12-22 19:19:50 -08:00
quanliu	a37328fc5c	[Feature] Batch invariant: Lora (#30097 ) Signed-off-by: quanliu <18646313696@163.com>	2025-12-23 10:32:47 +08:00
Pavani Majety	3e10262356	Revert "[SM100] Enable fp8 compute for prefill MLA (#30746 )" (#31197 ) Signed-off-by: Pavani Majety <pmajety@nvidia.com>	2025-12-22 18:15:33 -08:00
Angela Yi	612d5ffdab	[ci] Fix Pytorch compilation test oom in 2.10 (#31194 ) Signed-off-by: angelayi <yiangela7@gmail.com>	2025-12-23 01:56:47 +00:00
Divakar Verma	78e5e62bbf	[AMD][CI] fix v1/engine test_preprocess_error_handling (#31192 ) Signed-off-by: Divakar Verma <divakar.verma@amd.com>	2025-12-23 01:28:19 +00:00
Robert Shaw	b57b967386	[MoE Refactor][7/N] AITER MK (#31102 ) Signed-off-by: Robert Shaw <robshaw@redhat.com> Co-authored-by: Robert Shaw <robshaw@redhat.com>	2025-12-22 16:42:58 -07:00
Michael Goin	6d518ffbaa	[CI Failure] Disable mosaicml/mpt-7b and databricks/dbrx-instruct tests (#31182 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-22 15:40:35 -08:00
Benjamin Chislett	85aff45e24	[Perf] Remove blocking copy in GDN Attention (#31167 ) Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>	2025-12-22 14:25:22 -08:00
Wentao Ye	5312a7284e	[Bug] Fix `'CutlassMLAImpl' object has no attribute '_workspace_buffer'` (#31173 ) Signed-off-by: yewentao256 <zhyanwentao@126.com>	2025-12-22 14:24:27 -08:00
Lucas Wilkinson	de71747655	[SpecDecode] Simplified alternative padded-speculation acceptance rate fix (#29845 ) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-12-22 13:06:10 -08:00
Michael Goin	9586354053	[Doc] Add vllm-metal to hardware plugin documentation (#31174 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-12-22 20:06:29 +00:00

12510 Commits