Deboleina
02a4169193
[Tests] Tool call tests for openai/gpt-oss-20b (#26237)
...
Signed-off-by: Debolina Roy <debroy@redhat.com>
2025-12-05 19:03:29 -08:00
Micah Williamson
06579f9a82
[AMD][CI] Add ray[default] Dependency On ROCm To Pass v1/metrics/test_engine_logger_apis.py (#30110)
...
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-12-05 06:48:23 +00:00
Charlie Fu
7c9b2c8f81
[ROCm][CI] Add jiwer dependency for testing (#30081)
...
Signed-off-by: charlifu <charlifu@amd.com>
2025-12-05 03:34:51 +00:00
Noa Neria
6366c098d7
Validating Runai Model Streamer Integration with S3 Object Storage (#29320)
...
Signed-off-by: Noa Neria <noa@run.ai>
2025-12-04 18:04:43 +08:00
Jianwei Mao
80f8af4b2f
Fix error while downloading dependencies for CPU backend (#29797)
...
Signed-off-by: Jianwei Mao <maojianwei2016@126.com>
2025-12-04 06:04:44 +00:00
avigny
dd5d1ef780
[Bugfix] Mistral tool parser streaming update (#19425)
...
Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
Signed-off-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Jeff Cook <jeff@jeffcook.io>
Co-authored-by: sfbemerk <benjaminmerkel@mail.de>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-12-03 17:45:31 +00:00
Andreas Karatzas
506ed87e87
[ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues (#29909)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-03 10:36:49 +08:00
Andreas Karatzas
ea3370b428
[ROCm][Bugfix] Patch for the Multi-Modal Processor Test group (#29702)
...
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-11-29 01:31:44 +00:00
Li, Jiang
e2f56c309d
[CPU] Update torch 2.9.1 for CPU backend (#29664)
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-28 13:37:54 +00:00
HappyAmazonian
f8151b66fa
Revert "Supress verbose logs from model_hosting_container_standards (… (#29335)
...
Signed-off-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-28 05:29:05 -08:00
Cyrus Leung
b34e8775a3
Revert "[CPU]Update CPU PyTorch to 2.9.0 (#29589)" (#29647)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 22:43:18 -08:00
scydas
35657bcd7a
[CPU]Update CPU PyTorch to 2.9.0 (#29589)
...
Signed-off-by: scyda <scyda@outlook.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2025-11-28 09:34:33 +08:00
Andrii Skliar
a5345bf49d
[BugFix] Fix plan API Mismatch when using latest FlashInfer (#29426)
...
Signed-off-by: Andrii Skliar <askliar@askliar-mlt.client.nvidia.com>
Co-authored-by: Andrii Skliar <askliar@askliar-mlt.client.nvidia.com>
2025-11-27 11:34:59 -08:00
Harry Mellor
e1f262337b
Update Transformers pin in CI to 4.57.3 (#29418)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-27 08:42:14 -08:00
Johnny Yang
ba1fcd84a7
[TPU] add tpu_inference (#27277)
...
Signed-off-by: Johnny Yang <johnnyyang@google.com>
2025-11-26 14:46:36 -08:00
Ryan Rock
fe3a4f5b34
[CI/Build] Pin torchgeo dependency for AMD (#29353)
...
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-11-25 07:14:59 +00:00
Divakar Verma
22b42b5402
[CI][ROCm] Install arctic-inference on ROCm tests (#29344)
...
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-11-25 02:15:39 +00:00
Kunshang Ji
b8328b49fb
[XPU] upgrade torch & ipex 2.9 on XPU platform (#29307)
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-11-25 09:34:47 +08:00
Nicolò Lucchesi
26a465584a
[NIXL] Use config to enable telemetry + NIXL version bump (#29305)
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-11-24 17:18:04 +00:00
Roger Wang
0ff70821c9
[Core] Deprecate xformers (#29262)
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Micah Williamson
55c21c8836
[ROCm][CI] Fix "Cannot re-initialize CUDA in forked subprocess" in test_pynccl.py (#29119)
...
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-11-23 13:05:00 +08:00
Ryan Rock
ed8e6843cc
[CI/Build] Add terratorch for AMD (#29205)
...
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
2025-11-21 17:31:22 -08:00
Bhagyashri
2b1b3dfa4b
Update Dockerfile to use gcc-toolset-14 and fix test case failures on power (ppc64le) (#28957)
...
Signed-off-by: Bhagyashri <Bhagyashri.Gaikwad2@ibm.com>
2025-11-21 12:24:09 +00:00
TJian
82b05b15e6
[BugFix] [FEAT] Enable fastsafetensors for ROCm platform (#28225)
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-11-20 16:34:11 +00:00
cjackal
66483a9d00
[Chore] Update xgrammar version from 0.1.25 to 0.1.27 (#28221)
...
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2025-11-20 02:53:09 -08:00
Roman Solomatin
71d0ae1c54
[Misc] Update embedding/cross encoder tests to use mteb v2 (#27329)
...
Signed-off-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
Signed-off-by: wang.yuqi <noooop@126.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: wang.yuqi <noooop@126.com>
Co-authored-by: wang.yuqi <yuqi.wang@daocloud.io>
2025-11-18 22:28:40 -08:00
Li, Jiang
20852c8f4c
[CPU] Refactor CPU WNA16 (#28826)
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-19 10:32:00 +08:00
Luciano Martins
c2612371ad
[Model] Add Gemma3 GGUF multimodal support (#27772)
...
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-11-18 08:56:29 -08:00
Michael Goin
88ab591f0b
Run macos smoke test workflow on main commit (#28752)
...
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-11-18 11:16:03 +08:00
Julien Denize
085424808e
Remove audio optional dependency for mistral-common (#28722)
...
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
Signed-off-by: Julien Denize <40604584+juliendenize@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-11-14 09:54:38 -08:00
Nicolò Lucchesi
6f1e7f7226
[DisaggEverything] Tokens in<>out /generate endpoint (#24261)
...
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 09:58:01 -07:00
Roger Wang
0aecd9138f
[Misc] Update xformers to 0.33.0.post1 (#28678)
...
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-13 21:52:53 -08:00
Sage Moore
2aa75c752b
[ROCm] Bump up the version of amd-smi to 6.4.3 (#28680)
...
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-11-14 01:24:28 +00:00
Kebe
faed7bf07e
[Bugfix] [CPU] bump torch to 2.9.0 for Darwin to fix segmentation fault (#27791)
...
Signed-off-by: Kebe <mail@kebe7jun.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-11-13 12:48:08 -08:00
Huy Do
c33b87e777
Use official xformers-0.0.33 built for PT 2.9 (#28600)
...
Signed-off-by: Huy Do <huydhn@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-11-12 22:48:53 -08:00
Gregory Shtrasberg
d75ad04818
[ROCm][Bugfix] Revert removing setuptools version restriction (#28592)
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-11-12 16:46:58 -08:00
Zuyi Zhao
bca74e32b7
[Frontend] Add sagemaker_standards dynamic lora adapter and stateful session management decorators to vLLM OpenAI API server (#27892)
...
Signed-off-by: Zuyi Zhao <zhaozuy@amazon.com>
Signed-off-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Shen Teng <sheteng@amazon.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
2025-11-11 04:57:01 +00:00
yihong
3a7d580343
fix: close issue 28338 by fixed python version (#28339)
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-11-09 05:07:26 +00:00
Cole Murray
32787d0644
Remove setuptools upper bound constraint (<80) (#28337)
...
Signed-off-by: Cole Murray <colemurray.cs@gmail.com>
2025-11-08 22:30:18 +00:00
Aurick Qiao
781f5ebf52
Bump arctic-inference requirement (#28174)
...
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-11-07 18:31:18 -08:00
Harry Mellor
811df41ee9
Update Flashinfer from v0.4.1 to v0.5.2 (#27952)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-07 16:24:42 -08:00
Andy Lo
5e0c1fe69c
[Structured outputs] Upgrade llguidance to 1.3.0 (#28039)
...
Signed-off-by: Andy Lo <andy@mistral.ai>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2025-11-06 10:24:47 -08:00
R3hankhan
e04492449e
[Hardware][IBM Z] Optimize s390x Dockerfile (#28023)
...
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2025-11-05 11:25:44 -08:00
Zhewen Li
2f84ae1f27
[CI/Build] Update LM Eval Version in AMD CI (#27944)
...
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-11-04 06:36:40 +00:00
Aurick Qiao
2c19d96777
[Spec Decode] Integrate Suffix Decoding from Arctic Inference (#25784)
...
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
2025-11-03 09:23:31 -08:00
Harry Mellor
799ce45cc1
[Docs] Mock all imports for docs (#27873)
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-01 10:02:23 +00:00
Cyrus Leung
879a06579e
[CI/Build] Bump transformers version (#27528)
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-31 22:11:07 -07:00
Huy Do
ba33e8830d
Reapply "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" (#27768)
...
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-10-30 10:22:30 -07:00
Benjamin Bartels
17d055f527
[Feat] Adds runai distributed streamer (#27230)
...
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: omer-dayan <omdayan@nvidia.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-29 21:09:10 -07:00
Kunshang Ji
b5bae42f91
[XPU] Update latest IPEX 2.8 release (#27735)
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-10-30 11:17:13 +08:00