Wentao Ye
|
ac1886588f
|
[CI] Fix re import error (#29973)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2025-12-03 15:16:54 -05:00 |
|
Yongtao Huang
|
2fc5d6e0d7
|
Fix LLMEngine.__del__ dp_group cleanup condition (#29954)
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
|
2025-12-03 12:14:44 -08:00 |
|
elvischenv
|
afe9eb408e
|
[Bugfix] Fix flashinfer ar+norm kernel not available issue (#29960)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
|
2025-12-03 18:50:53 +00:00 |
|
Varun Sundar Rabindranath
|
19bee6d12d
|
[Performance][DP/EP] Add silu_mul_per_token_group_quant_fp8_colmajor kernel (#29470)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2025-12-03 18:04:59 +00:00 |
|
avigny
|
dd5d1ef780
|
[Bugfix] Mistral tool parser streaming update (#19425)
Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Jeff Cook <jeff@jeffcook.io>
Co-authored-by: sfbemerk <benjaminmerkel@mail.de>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-03 17:45:31 +00:00 |
|
Micah Williamson
|
d1f7392c5f
|
[ROCm][CI] Fix v1/logits_processors failure on ROCm (#29927)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2025-12-04 01:17:07 +08:00 |
|
Yu Jiaqi
|
9ae3c55b10
|
SigLIP example add chat_template (#29902)
Signed-off-by: piood <2477084691@qq.com>
|
2025-12-03 16:12:58 +00:00 |
|
Lumis Chen
|
9bcf92295a
|
[Core] Add xxHash as a high-performance hash option for accelerating prefix caching (#29163)
Signed-off-by: LuminolT <lumischen01@gmail.com>
Signed-off-by: Lumis Chen <lumischen01@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
|
2025-12-03 16:06:57 +00:00 |
|
rasmith
|
5aa9b09040
|
[CI/Build][AMD] Skip test_shared_storage_connector_hashes in test_shared_storage_connector.py due to hipErrorLaunchFailure when calling .cpu() (#29839)
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
|
2025-12-03 22:56:35 +08:00 |
|
ioana ghiban
|
1bb17ecb39
|
[CPU Backend] [Doc]: Update Installation Docs for CPUs (#29868)
Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>
|
2025-12-03 13:33:50 +00:00 |
|
ioana ghiban
|
15b1511a15
|
[GPU Backend] [Doc]: Remove duplicate statements on missing GPU wheels. (#29962)
Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>
|
2025-12-03 12:56:47 +00:00 |
|
Chauncey
|
b78772c433
|
[Frontend] supports deepseekv32 chat template (#29837)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-03 20:53:44 +08:00 |
|
Amr Mahdi
|
f5d3d93c40
|
[docker] Build CUDA kernels in separate Docker stage for faster rebuilds (#29452)
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
|
2025-12-03 11:41:53 +00:00 |
|
Fadi Arafeh
|
78f4bb0ba8
|
[DOC] Add Arm to list of compute resouces providers (#29894)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
|
2025-12-03 11:36:58 +00:00 |
|
HDCharles
|
b294e28db2
|
[refactor] CTMoEMethods to use QuantizationArgs (#28871)
Signed-off-by: HDCharles <charlesdavidhernandez@gmail.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 11:00:56 +00:00 |
|
Roger Wang
|
787b84a9fc
|
[Bugfix] Follow-up fix on MediaWithBytes (#29951)
Signed-off-by: Roger Wang <hey@rogerw.io>
|
2025-12-03 10:42:49 +00:00 |
|
Tsukasa OI
|
42c1949643
|
[Bugfix][Quantization] Support BF16 tensors on GGUF (#29948)
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
|
2025-12-03 10:33:46 +00:00 |
|
Isotr0py
|
cc4e296ea6
|
[CI/Build] Avoid duplicate empty inputs test for common multimodal generation tests (#29907)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 10:27:36 +00:00 |
|
Isotr0py
|
a21cd9ed23
|
[Bugfix] Fix incorrect image_grid_thw rank for HunyuanOCR from missing merge_by_field_config=True (#29950)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 10:05:10 +00:00 |
|
WeiQing Chen
|
7fe9c1a223
|
[CI] Add Async Eplb nightly CI tests (#29385)
Signed-off-by: David Chen <530634352@qq.com>
Signed-off-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2025-12-03 09:51:08 +00:00 |
|
Chauncey
|
3f42b05fbc
|
[Refactor] [1/N] to simplify the vLLM serving architecture (#28040)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-12-03 01:26:39 -08:00 |
|
Yong Hoon Shin
|
69520bc695
|
Add logging for cudagraph related info (#29825)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
|
2025-12-03 01:01:48 -08:00 |
|
Andrew Xia
|
3a7751485b
|
[responsesAPI] support input output messages for non harmony models (#29549)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 23:59:23 -08:00 |
|
Cyrus Leung
|
bbfb55c29e
|
[Misc] Allow fetch_* utils to access local files by default (#29932)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-12-03 15:49:34 +08:00 |
|
JackieWu
|
0bec63fa31
|
[BugFix] fix imgs_pos in hunyuan_vl (#29879)
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-03 06:20:37 +00:00 |
|
elvischenv
|
c719c40540
|
[Bugfix] Defunctionalize TRTLLM AR+Norm op for avoiding extra clone kernel before it (#29631)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-03 05:15:50 +00:00 |
|
Russell Bryant
|
b08025a83b
|
[Docs] Discuss api key limitations in security guide (#29922)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-12-02 20:57:28 -08:00 |
|
Arpit Khandelwal
|
d7284a2604
|
[Core] Rename PassConfig flags as per RFC #27995 (#29646)
Signed-off-by: arpitkh101 <arpit5khandelwal@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-03 03:38:55 +00:00 |
|
Andreas Karatzas
|
506ed87e87
|
[ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues (#29909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2025-12-03 10:36:49 +08:00 |
|
Roger Wang
|
4dd7978374
|
[Bugfix] Fix regression on pooling models from PR#29621 (#29921)
Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-03 10:33:45 +08:00 |
|
Lucas Wilkinson
|
5cdd664509
|
[BugFix] Fix assert in build_for_cudagraph_capture (#29893)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2025-12-02 16:56:54 -08:00 |
|
Alexei-V-Ivanov-AMD
|
5f67361fd1
|
Reverting re-direction to amd_mi355_X. (#29914)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-12-03 00:40:02 +00:00 |
|
maang-h
|
5d91d2b292
|
[Doc] Add allocate_slots parameter docs (#29777)
Signed-off-by: maang <maang_h@163.com>
Signed-off-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
|
2025-12-02 23:23:09 +00:00 |
|
Micah Williamson
|
c014de1ec7
|
[ROCm][CI] Fix test_cudagraph_mode.py Failure For AMD CI (#29808)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2025-12-02 22:54:36 +00:00 |
|
Julien Denize
|
1b1e35aaf9
|
[BUGFIX] Fix regex pattern for Mistral Tool Call (#29918)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:58 -08:00 |
|
Julien Denize
|
5e5646e206
|
[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention (#29908)
Signed-off-by: juliendenize <julien.denize@mistral.ai>
|
2025-12-02 14:51:20 -08:00 |
|
Chauncey
|
0a9caca9f5
|
[Bugfix] fix --scheduling-policy=priority & n>1 crashes engine (#29764)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-12-02 22:42:28 +00:00 |
|
Sage Moore
|
e6f114ac25
|
[Bugfix][EPLB] Prevent user-provided EPLB config from being overwritten with defaults (#29911)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-12-02 13:20:22 -09:00 |
|
Harry Mellor
|
6fc5841db1
|
Fix some more Transformers nightly tests (#29872)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-02 21:49:44 +00:00 |
|
dependabot[bot]
|
3ff5b53bc2
|
Bump actions/setup-python from 6.0.0 to 6.1.0 (#29768)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2025-12-02 21:29:32 +00:00 |
|
jthomson04
|
1528e079e2
|
[Perf] Avoid pageable HtoD transfer in MinTokensLogitsProcessor (#29826)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
|
2025-12-02 21:25:52 +00:00 |
|
Divakar Verma
|
afb1e5b380
|
[CI][ROCm][tests/v1/e2e] Fix multiprocessing launch for the test (#29123)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
|
2025-12-02 20:46:10 +00:00 |
|
Copilot
|
1c593e117d
|
Fix boolean nested params, add dict format support, and enhance plotting for vllm bench sweep (#29025)
Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-12-02 20:40:56 +00:00 |
|
Navanit Dubey
|
a2b053dc85
|
feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE (#29896)
Signed-off-by: navanit-git <navanitdubey@gmail.com>
|
2025-12-02 19:28:35 +00:00 |
|
Matthew Bonanni
|
1d93f11675
|
[Attention][CUDAGraph] Remove CG padding from attention backends (#29352)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2025-12-02 13:48:08 -05:00 |
|
Benjamin Bartels
|
2d613de9ae
|
[CI/Build] Fixes missing runtime dependencies (#29822)
Signed-off-by: bbartels <benjamin@bartels.dev>
|
2025-12-02 10:21:49 -08:00 |
|
Alexei-V-Ivanov-AMD
|
c77b9929a0
|
Update AMD-CI testing mirror (as of 2025-12-02) (#29898)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-12-02 08:52:54 -09:00 |
|
Isotr0py
|
63b1da76ba
|
[Chore]: Reorganize gguf utils funtions under transformers_utils (#29891)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-12-02 17:33:23 +00:00 |
|
Andrew Xia
|
52cb349fc0
|
[responsesAPI][3] ResponsesParser to set up non harmony MCP (#29413)
Signed-off-by: Andrew Xia <axia@fb.com>
Co-authored-by: Andrew Xia <axia@fb.com>
|
2025-12-02 11:24:45 -05:00 |
|
Isotr0py
|
0ec8422171
|
[Bugfix] Fix incorrect channel order for idefics3 in edge case (#29881)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-02 16:03:52 +00:00 |
|