277 Commits

Author SHA1 Message Date
Ryan Rock
fe3a4f5b34
[CI/Build] Pin torchgeo dependency for AMD (#29353)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-11-25 07:14:59 +00:00
Divakar Verma
22b42b5402
[CI][ROCm] Install arctic-inference on ROCm tests (#29344)
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-11-25 02:15:39 +00:00
Kunshang Ji
b8328b49fb
[XPU] upgrade torch & ipex 2.9 on XPU platform (#29307)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-11-25 09:34:47 +08:00
Nicolò Lucchesi
26a465584a
[NIXL] Use config to enable telemetry + NIXL version bump (#29305)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-11-24 17:18:04 +00:00
Roger Wang
0ff70821c9
[Core] Deprecate xformers (#29262)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Micah Williamson
55c21c8836
[ROCm][CI] Fix "Cannot re-initialize CUDA in forked subprocess" in test_pynccl.py (#29119)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-11-23 13:05:00 +08:00
Ryan Rock
ed8e6843cc
[CI/Build] Add terratorch for AMD (#29205)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-11-21 17:31:22 -08:00
Bhagyashri
2b1b3dfa4b
Update Dockerfile to use gcc-toolset-14 and fix test case failures on power (ppc64le) (#28957)
Signed-off-by: Bhagyashri <Bhagyashri.Gaikwad2@ibm.com>
2025-11-21 12:24:09 +00:00
TJian
82b05b15e6
[BugFix] [FEAT] Enable fastsafetensors for ROCm platform (#28225)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-11-20 16:34:11 +00:00
cjackal
66483a9d00
[Chore] Update xgrammar version from 0.1.25 to 0.1.27 (#28221)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2025-11-20 02:53:09 -08:00
Roman Solomatin
71d0ae1c54
[Misc] Update embedding/cross encoder tests to use mteb v2 (#27329)
Signed-off-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: wang.yuqi <noooop@126.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-11-18 22:28:40 -08:00
Li, Jiang
20852c8f4c
[CPU] Refactor CPU WNA16 (#28826)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-19 10:32:00 +08:00
Luciano Martins
c2612371ad
[Model] Add Gemma3 GGUF multimodal support (#27772)
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-18 08:56:29 -08:00
Michael Goin
88ab591f0b
Run macos smoke test workflow on main commit (#28752)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-18 11:16:03 +08:00
Julien Denize
085424808e
Remove audio optional dependency for mistral-common (#28722)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-14 09:54:38 -08:00
Nicolò Lucchesi
6f1e7f7226
[DisaggEverything] Tokens in<>out /generate endpoint (#24261)
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 09:58:01 -07:00
Roger Wang
0aecd9138f
[Misc] Update xformers to 0.33.0.post1 (#28678)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-13 21:52:53 -08:00
Sage Moore
2aa75c752b
[ROCm] Bump up the version of amd-smi to 6.4.3 (#28680)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-11-14 01:24:28 +00:00
Kebe
faed7bf07e
[Bugfix] [CPU] bump torch to 2.9.0 for Darwin to fix segmentation fault (#27791)
Signed-off-by: Kebe <mail@kebe7jun.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-11-13 12:48:08 -08:00
Huy Do
c33b87e777
Use official xformers-0.0.33 built for PT 2.9 (#28600)
Signed-off-by: Huy Do <huydhn@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-12 22:48:53 -08:00
Gregory Shtrasberg
d75ad04818
[ROCm][Bugfix] Revert removing setuptools version restriction (#28592)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-11-12 16:46:58 -08:00
Zuyi Zhao
bca74e32b7
[Frontend] Add sagemaker_standards dynamic lora adapter and stateful session management decorators to vLLM OpenAI API server (#27892)
Signed-off-by: Zuyi Zhao <zhaozuy@amazon.com>
Signed-off-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2025-11-11 04:57:01 +00:00
yihong
3a7d580343
fix: close issue 28338 by fixed python version (#28339)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-11-09 05:07:26 +00:00
Cole Murray
32787d0644
Remove setuptools upper bound constraint (<80) (#28337)
Signed-off-by: Cole Murray <colemurray.cs@gmail.com>
2025-11-08 22:30:18 +00:00
Aurick Qiao
781f5ebf52
Bump arctic-inference requirement (#28174)
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-11-07 18:31:18 -08:00
Harry Mellor
811df41ee9
Update Flashinfer from v0.4.1 to v0.5.2 (#27952)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-07 16:24:42 -08:00
Andy Lo
5e0c1fe69c
[Structured outputs] Upgrade llguidance to 1.3.0 (#28039)
Signed-off-by: Andy Lo <andy@mistral.ai>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-11-06 10:24:47 -08:00
R3hankhan
e04492449e
[Hardware][IBM Z] Optimize s390x Dockerfile (#28023)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2025-11-05 11:25:44 -08:00
Zhewen Li
2f84ae1f27
[CI/Build] Update LM Eval Version in AMD CI (#27944)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-11-04 06:36:40 +00:00
Aurick Qiao
2c19d96777
[Spec Decode] Integrate Suffix Decoding from Arctic Inference (#25784)
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
2025-11-03 09:23:31 -08:00
Harry Mellor
799ce45cc1
[Docs] Mock all imports for docs (#27873)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-01 10:02:23 +00:00
Cyrus Leung
879a06579e
[CI/Build] Bump transformers version (#27528)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-31 22:11:07 -07:00
Huy Do
ba33e8830d
Reapply "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" (#27768)
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-10-30 10:22:30 -07:00
Benjamin Bartels
17d055f527
[Feat] Adds runai distributed streamer (#27230)
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: omer-dayan <omdayan@nvidia.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-29 21:09:10 -07:00
Kunshang Ji
b5bae42f91
[XPU] Update latest IPEX 2.8 release (#27735)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-10-30 11:17:13 +08:00
Varun Sundar Rabindranath
f6d5f5888c
[Build] Revert triton_kernels requirements (#27659) 2025-10-28 21:07:09 -07:00
Simon Mo
9007bf57e6
Revert "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" (#27714) 2025-10-28 20:58:01 -07:00
Huy Do
f257544709
Install pre-built xformers-0.0.32.post2 built with pt-2.9.0 (#27598)
Signed-off-by: Huy Do <huydhn@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-28 19:39:15 -07:00
Cyrus Leung
6ebffafbb6
[Misc] Clean up more utils (#27567)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-27 15:30:38 +00:00
Varun Sundar Rabindranath
a9f55dc588
[Misc] Add triton_kernels dependency (#27370)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
2025-10-23 12:04:14 -07:00
RED
c9461e05a4
Support Anthropic API /v1/messages Endpoint (#22627)
Signed-off-by: liuli <ll407707@alibaba-inc.com>
Co-authored-by: liuli <ll407707@alibaba-inc.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-10-22 09:13:18 -07:00
Huy Do
becb7de40b
Update PyTorch to 2.9.0+cu129 (#24994)
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-10-21 17:20:18 -04:00
Fadi Arafeh
163965d183
[cpu] Dispatch un-quantized linear to oneDNN/ACL by default for AArch64 (#27183)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Michael Yang <Michael.Yang@arm.com>
2025-10-21 02:02:58 +00:00
Concurrensee
58fbbcb2f5
[ROCm] enable some tests in entrypoints test groups on AMD (#26725)
Signed-off-by: Yida <yida.wu@amd.com>
2025-10-21 00:37:16 +00:00
jiahanc
41d3071918
[NVIDIA] [Perf] Update to leverage flashinfer trtllm FP4 MOE throughput kernel (#26714)
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-10-16 16:20:25 -07:00
Lukas Geiger
ed344f4116
Cleanup code after Python 3.10 upgrade (#26520)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
2025-10-16 03:38:23 -07:00
Zhewen Li
44c8555621
[CI/Build] Fix AMD import failures in CI (#26841)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-10-16 07:28:20 +00:00
wangxiyuan
8f4b313c37
[Misc] rename torch_dtype to dtype (#26695)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-10-15 12:11:48 +00:00
youkaichao
8a0af6a561
[build][torch.compile] upgrade depyf version (#26702)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-10-14 10:12:09 +08:00
Sangyeon Cho
a1b2d658ee
[CI/Build] upgrade compressed-tensors to 0.12.2 to address LGPLv3 (#26501)
Signed-off-by: Sangyeon Cho <josang1204@gmail.com>
2025-10-13 12:58:33 -04:00