11282 Commits

Author SHA1 Message Date
Michael Goin
c9a3a02149
Add output token counting to gsm8k eval (#28594)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-14 09:32:03 +00:00
Nick Hill
bc3e43069a
[BugFix] Fix multi-modal async scheduling race condition (#28706)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-11-14 01:11:13 -08:00
Jiangyun Zhu
c36bcfe6b3
[Bugfix] fix dots.ocr pp support (#28705)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-11-14 09:01:26 +00:00
Yan Ma
529cea343d
use default CCL_ZE_IPC_EXCHANGE (#28700)
Signed-off-by: Yan Ma <yan.ma@intel.com>
2025-11-14 16:55:29 +08:00
rasmith
93103575ce
[BugFix][CI/Build][ROCM] Fix import error and apply assert in appropriate case in test_struct_output_generate (#28311)
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
2025-11-13 22:41:29 -08:00
rasmith
15ae8e0784
[Bugfix][CI/Test][Spec Decode] Fix illegal memory access in offline_inference/spec_decode.py (Issue 27619) (#28432)
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
2025-11-13 22:34:01 -08:00
haoyangli-amd
0b25498990
[Misc] add ignore mapper for quark quantization (#28275)
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
2025-11-14 05:56:35 +00:00
Roger Wang
0aecd9138f
[Misc] Update xformers to 0.33.0.post1 (#28678)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-13 21:52:53 -08:00
Kunshang Ji
da14ae0fad
[XPU][CI]disable lm cache uts (#28696)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-11-14 03:15:50 +00:00
Cyrus Leung
01bea115c4
[Misc] Remove warn_for_unimplemented_methods (#28613)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-14 11:10:10 +08:00
Bradley D
b39a5026eb
[ci][amd] fix basic models extra init test (#28676)
Signed-off-by: Bradley Davis <bradleyhd@meta.com>
2025-11-14 02:44:36 +00:00
Michael Goin
622e6106a9
[CPU][Bugfix] Fix Apple Silicon M1 compilation failure (#28681)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-14 09:49:55 +08:00
Sage Moore
2aa75c752b
[ROCm] Bump up the version of amd-smi to 6.4.3 (#28680)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-11-14 01:24:28 +00:00
Hank_
4d5943bda6
[quantization][config] enable override existing quant_config (#28510)
Signed-off-by: Hank <hcc.mayday@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-11-14 01:24:10 +00:00
Alexei-V-Ivanov-AMD
f2b8e1c551
Mirrored test group definitions for AMD (2025-11-11) (#28573)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
2025-11-14 00:16:34 +00:00
Mark McLoughlin
6e25b1cddf
[KV Connector] Test async mode in scheduler tests (#28550)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2025-11-13 18:30:59 -05:00
Wentao Ye
e64011f29a
[CI] Bug: Fix ci entrypoint pooling (#28684)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-11-13 14:19:35 -08:00
Simon Mo
1b622deba7
[Misc] Update CODEOWNERS for simon-mo and comaniac (#28675)
Signed-off-by: Simon Mo <simon.mo@hey.com>
2025-11-13 21:01:43 +00:00
Kebe
faed7bf07e
[Bugfix] [CPU] bump torch to 2.9.0 for Darwin to fix segmentation fault (#27791)
Signed-off-by: Kebe <mail@kebe7jun.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-11-13 12:48:08 -08:00
Yanan Cao
262d263f6c
[Bugfix] Eliminate tuple inputs to submodules in graph partitioning (#28533)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
2025-11-13 15:09:05 -05:00
Qiu
968060c15a
[bugfix] correct local_chunk_len for DCP in reorg_kvcache with long context (#28526)
Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-13 11:29:22 -08:00
elvischenv
5d6ce2b960
[Perf] Support stream interval for reducing host overhead (#27869)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-11-13 13:21:25 -05:00
Matthew Bonanni
f9f3b596f3
[Attention][Bugfix] Fix FA sink support (#28660)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-11-13 13:20:01 -05:00
Yannick Schnider
119c4927b3
[Bugfix] Fix validate model input for decoder models (#27099)
Signed-off-by: Yannick Schnider <yannick.schnider1@ibm.com>
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-13 10:18:47 -08:00
Varun Sundar Rabindranath
fe1cd7704d
[Performance][B200] silu_mul_quant: pack scales in int32 (#28358)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-11-13 10:16:55 -08:00
Johnny Yang
fdfd5075aa
[TPU] patch TPU wheel build script to resolve metadata issue (#27279)
Signed-off-by: Johnny Yang <johnnyyang@google.com>
2025-11-13 09:36:54 -08:00
Nick Hill
327c0a9a23
[BugFix] Ensure EngineArgs.create_engine_config is idempotent (#28515)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-11-13 17:14:08 +00:00
Jane (Yuan) Xu
06c4873d95
Rewrite C++ meta funcs to Python (#28595)
Signed-off-by: Jane Xu <janeyx@meta.com>
2025-11-14 00:52:50 +08:00
Roger Wang
d3387750f1
[Misc] Turn off encoder torch compile by default (#28634)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-13 08:38:08 -08:00
Harry Mellor
b230286fbc
Fix get_num_experts when config sets it explicitly to None (#28652)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: bruceszchen <bruceszchen@tencent.com>
2025-11-13 16:02:42 +00:00
Yuanping Song
3035d1a166
[BugFix] DeepSeek-OCR: apply NoRepeatNGramLogitsProcessor to greedy path (#28617)
Signed-off-by: Yuanping Song <yuanping.song@outlook.com>
2025-11-13 15:24:35 +00:00
Huamin Li
07a606aa7e
[CI Failure] Fix backend selection for encoder-only models (#28534)
Signed-off-by: Huamin Li <3ericli@gmail.com>
2025-11-13 10:11:27 -05:00
amdfaa
a7791eac9d
[CI/Build] Install uv for AMD MI300: Language Models Tests (Hybrid) %N (#28142)
Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: zhewenli <zhewenli@meta.com>
2025-11-13 14:34:55 +00:00
Pleaplusone
8da2f28f53
[ROCm][BugFix]Fix get_cu_count in rocm_aiter_fa.py (#28618)
Signed-off-by: ganyi <ygan@amd.com>
2025-11-13 14:18:20 +00:00
Akash kaothalkar
86d15bfd8d
[Hardware][PowerPC] Fix fp16 compilation error for Power in cpu attention backend and bump oneDNN version (#28535)
Signed-off-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>
Co-authored-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>
2025-11-13 13:32:21 +00:00
Fanli Lin
c9fe6abe7c
[Bugfix] Fix FPS value type for Qwen2.5-Omni video processing (#28630)
Signed-off-by: Lin, Fanli <fanli.lin@intel.com>
2025-11-13 13:06:06 +00:00
zofia
c47b6c85ac
[XPU] add sym params to IPEXConfig (#28611)
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
2025-11-13 11:35:04 +00:00
baonudesifeizhai
c428e8d80b
Fix io processor pooling #28273 (#28484)
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
2025-11-13 11:34:14 +00:00
Zijing Liu
5e973209aa
[BugFix] Fix type error when assign a trition kernel tensor to a torch.nn.Parameter (#28603)
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
2025-11-13 11:30:04 +00:00
Di Wu
e63fd44560
Fix: Correctly filter special tokens in benchmark_prefix_caching (#28615)
Signed-off-by: Di Wu <dw2761@nyu.edu>
2025-11-13 10:57:44 +00:00
Yong Hoon Shin
11ac9ddd03
Support all interleaved layer types (#28485)
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
2025-11-13 08:57:20 +00:00
Chauncey
5c9ad138d5
[Frontend] supports interleaved thinking (#28531)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-11-13 16:14:13 +08:00
Jiangyun Zhu
fa183e9271
[Bugfix] fix kimi-linear crash (#28445)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
2025-11-13 07:59:58 +00:00
usberkeley
4ab34f6ef1
Add NUMA node validation for CPU thread binding (#28555)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
2025-11-13 07:03:52 +00:00
Huy Do
c33b87e777
Use official xformers-0.0.33 built for PT 2.9 (#28600)
Signed-off-by: Huy Do <huydhn@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-12 22:48:53 -08:00
tjandy98
4504e8029b
[Bugfix] Prevent crash on empty grammar string (#28210)
Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
2025-11-13 06:42:29 +00:00
Pleaplusone
ca00b1bfc6
[ROCm][BugFix] Remove the usage of device_info from aiter (#28383)
Signed-off-by: ganyi <ygan@amd.com>
2025-11-12 21:43:42 -08:00
Radu Salavat
d44fbbab0e
[build][cmake]: Bundle static ACL and torch libgomp for CPU extension builds (#28059)
Signed-off-by: Radu Salavat <radu.salavat@arm.com>
2025-11-13 05:43:08 +00:00
Lucia Fang
7e082bc14e
Support DeepEP for Kimi-k2-thinking through enabling gemm selection for compressed-tensor marlin wna16 (#28574)
Signed-off-by: Lu Fang <fanglu@fb.com>
2025-11-12 21:40:45 -08:00
Fanli Lin
dbbe0c756a
[XPU] Support Triton path for LoRA operations on XPU (#28511)
Signed-off-by: Fanli Lin <fanli.lin@intel.com>
2025-11-13 05:31:42 +00:00