Anexdeus
2b03137fca
Merge branch 'mlm-full-lora-support' of https://github.com/jeejeelee/vllm into mlm-full-lora-support
2025-12-20 20:40:31 +03:00
bk-201
cb72a0ef01
fix pre-commit
...
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-12-20 16:36:13 +00:00
bk-201
68116edfe2
fix bug
...
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-12-20 16:20:12 +00:00
Anexdeus
c6831e793d
extended SupportsMultiModal
2025-12-20 17:22:41 +03:00
Anexdeus
cd32aeadfa
Merge branch 'jeejeelee:mlm-full-lora-support' into mlm-full-lora-support
2025-12-20 15:29:40 +03:00
Anexdeus
d525556a25
Revert the mixin changes
2025-12-20 13:31:53 +03:00
Anexdeus
b03d1a04a8
added ProcessingInfoMixin for QwenVL series models
2025-12-20 12:29:46 +03:00
Jee Jee Li
e5ba472ae2
Merge branch 'main' into mlm-full-lora-support
2025-12-20 15:19:28 +08:00
bk-201
4c2e95ad56
correct f-string formatting
...
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-12-20 06:23:33 +00:00
bk-201
9c9950c080
fix
...
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-12-20 04:05:59 +00:00
Lucas Wilkinson
ff2168bca3
[CI] Fix fixture 'siglip_attention_config' not found ( #31053 )
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-20 03:46:15 +00:00
Gregory Shtrasberg
0be149524c
[ROCm][CI/Build] Update ROCm dockerfiles ( #30991 )
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-12-20 03:19:12 +00:00
Jee Jee Li
d053aa73e1
Fix
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-12-20 01:47:11 +00:00
zejunchen-zejun
d52c5096d7
[Bugfix] fix the alias bug of AttentionBackendEnum when register CUSTOM attention backend to vllm ( #30869 )
...
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
2025-12-20 09:03:35 +08:00
Jee Jee Li
463074fac8
Merge branch 'main' into mlm-full-lora-support
2025-12-20 08:25:41 +08:00
Yuxuan Zhang
8a7a414374
GLM-4.7 Tool Parser and Doc Update ( #30876 )
...
Signed-off-by: zRzRzRzRzRzRzR <2448370773@qq.com>
2025-12-20 00:09:58 +00:00
Robert Shaw
95befecc18
[MoE Refactor][2/N] Use Modular Kernels for Fp8 ( #30825 )
...
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2025-12-19 23:36:38 +00:00
Wentao Ye
4cf9429897
[Bug] Fix error 'Dynamo failed to run FX node with fake tensors' for Deepseek V3.2 ( #31046 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-12-19 23:31:31 +00:00
Robert Shaw
83a317f650
[MoE Refactor][3/N] Deprecate cutlass block quant fp8 (b200) ( #30990 )
...
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
2025-12-19 13:09:54 -08:00
Lucas Wilkinson
5f6477d1d0
[BugFix] Fix TypeError: unhashable type: 'dict' when serving deepseek32 ( #30924 )
...
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-19 16:07:54 -05:00
Wentao Ye
3bd8335bd0
[Refactor] Refactor for DeepGemmQuantScaleFMT using cache ( #30898 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-12-19 13:50:39 -07:00
Seiji Eicher
1ab5213531
Make engine core client handshake timeout configurable ( #27444 )
...
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
2025-12-19 20:38:30 +00:00
Zhonghua Deng
969bbc7c61
[Model] Add MiMo-V2-Flash support ( #30836 )
...
Signed-off-by: Abatom <abzhonghua@gmail.com>
Signed-off-by: Jumiar <liuanqim10@126.com>
Signed-off-by: Zyann7 <zyann7@outlook.com>
Co-authored-by: Jumiar <liuanqim10@126.com>
Co-authored-by: Zyann7 <zyann7@outlook.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2025-12-19 17:17:03 +00:00
bk-201
764aa45140
fix bug
...
Signed-off-by: bk-201 <joy25810@foxmail.com>
2025-12-19 16:57:25 +00:00
Andrey Talman
268a972c62
Update Pytorch version update docs ( #30982 )
2025-12-19 16:08:53 +00:00
Jinzhen Lin
5fbfa8d9ef
[Quantization] fix marlin w8a8 check ( #30961 )
...
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
2025-12-19 07:33:22 -08:00
Shanshan Shen
23a1946e3b
[CustomOp][Refactor] Extract common methods for ApplyRotaryEmb CustomOp ( #31021 )
...
Signed-off-by: shen-shanshan <467638484@qq.com>
2025-12-19 22:16:09 +08:00
Thomas Parnell
b5545d9d5c
[Bugfix] [Kernel] Triton attention kernels: mask out V blocks that fall outside sliding window ( #30887 )
...
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
2025-12-19 21:39:54 +08:00
Nishidha Panpaliya
bd2b52fc2d
[CPU][Bugfix] Fix ppc64le CPU build ( #30871 )
...
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
2025-12-19 12:26:35 +00:00
Li, Jiang
420ba2dbb6
Enable aarch64 CPU performance benchmarks ( #26494 )
...
Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Ioana Ghiban <ioana.ghiban@arm.com>
Co-authored-by: Fadi Arafeh <fadi.arafeh@arm.com>
2025-12-19 12:16:18 +00:00
Marko Rosenmueller
455949675d
[Frontend][Bug] allow tool calls in analysis channel ( #28139 )
...
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-12-19 10:47:44 +00:00
lif
086b96339f
[Bugfix] Add validation for tool requests when tool_parser is unavailable ( #30613 )
...
Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 18:23:28 +08:00
Jinzhen Lin
9187de9fac
[Quantization] enable compressed-tensors marlin support for turing (2) ( #31008 )
...
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
2025-12-19 08:56:35 +00:00
Isotr0py
ac1c934276
[Bugfix] Fix incorrect tiles creation for mm prefix triton attention ( #30974 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-19 16:00:33 +08:00
Wenqi Glantz
4924ac582c
Add hidden dimension validation for multimodal embedding inputs ( #30968 )
...
Signed-off-by: Wenqi Glantz <wglantz@nvidia.com>
2025-12-19 07:59:36 +00:00
Li, Jiang
096b25c9ed
[Doc][CPU] Fix index link for CPU regular release wheels ( #31015 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-12-19 07:29:52 +00:00
Jinzhen Lin
de08b8f61b
[Quantization] enable compressed-tensors marlin support for turing ( #31000 )
...
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
2025-12-18 20:29:48 -08:00
Nick Hill
2ac85a4544
[BugFix] Fix logprobs with spec decode and modified logits ( #30846 )
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-18 19:58:28 -08:00
Andreas Karatzas
7b43db210c
[ROCm][CI][Bugfix] Multi-Modal Model Support Fixes and Attention Backend Improvements ( #30270 )
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-19 02:17:27 +00:00
PlatinumGod
6a09612b2e
[Bugfix] Fix tool_choice="none" being ignored by GPT-OSS/harmony models ( #30867 )
...
Signed-off-by: yujiepu <pyjapple@gmail.com>
Signed-off-by: PlatinumGod <pyjapple@gmail.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
2025-12-19 09:34:27 +08:00
Nick Hill
45c0526ac9
[BugFix] Handle errors when preprocessing added requests ( #30895 )
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-19 01:29:11 +00:00
Benjamin Chislett
d6b3d39b6d
[Cleanup] Refactor FlashInferMetadataBuilder ( #29128 )
...
Signed-off-by: Benjamin Chislett <bchislett@nvidia.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-12-18 14:45:30 -08:00
Chendi.Xue
6ca74bc11a
[NIXL][BUG FIX] Fix both failing issue and accuracy issue with nixl + host_buffer on CUDA ( #30419 )
...
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
2025-12-18 22:10:02 +00:00
Harry Mellor
19c583398a
Check for truthy rope_parameters not the existence of it ( #30983 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-18 13:59:10 -08:00
Nick Hill
b0b77c4655
[BugFix] Fix spec decode + structured outputs + preemption edge case ( #30916 )
...
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-12-18 12:59:55 -08:00
Kayvan Mivehnejad
634a14bd7d
Strengthen input validation and tests for 'parse_raw_prompts'. ( #30652 )
...
Signed-off-by: Kayvan Mivehnejad <K.Mivehnejad@gmail.com>
2025-12-18 19:51:58 +00:00
Chen Zhang
24b65eff0d
[BugFix] Spec decode with VLLM_ENABLE_V1_MULTIPROCESSING=0 ( #30319 )
...
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-12-18 19:47:56 +00:00
Elizabeth Thomas
41b6f9200f
Remove all2all backend envvar ( #30363 )
...
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-18 19:46:28 +00:00
Wentao Ye
97000a2be7
[Bug] Fix compressed tensor not using deepgemm ( #30820 )
...
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-12-18 14:45:55 -05:00
Isotr0py
d2dc5dfc6e
[Bugfix] Remove tile_size=64 for mm_prefix triton attention ( #30973 )
...
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-12-18 20:42:32 +01:00