163 Commits

Author SHA1 Message Date
Peter Pan
1eb2b9c102
[CI] update typos config for CI pre-commit and fix some spells (#20919)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-15 21:12:40 -07:00
Doug Smith
7976446015
Add Dockerfile argument for VLLM_USE_PRECOMPILED environment (#20943)
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-15 19:53:57 -07:00
TJian
80d38b8ac8
[V1] [ROCm] [AITER] Upgrade AITER to commit 916bf3c and bugfix APIs (#20880)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-07-13 15:19:32 +00:00
Michael Goin
cf75cd2098
[CI Bugfix] Specify same TORCH_CUDA_ARCH_LIST for flashinfer aot and install (#20772)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-11 01:16:01 +00:00
Michael Goin
4b9a9435bb
Update Dockerfile FlashInfer to v0.2.8rc1 (#20718)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-10 08:09:02 -07:00
Michael Goin
b7d9e9416f
[CI/Build] Fix FlashInfer double build in Dockerfile (#20651)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-09 17:41:56 -06:00
Liangliang Ma
2c5ebec064
[XPU][CI] add v1/core test in xpu hardware ci (#20537)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
2025-07-07 01:16:40 -07:00
Peter Pan
5561681d04
[CI] add kvcache-connector dependency definition and add into CI build (#18193)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-04 06:49:18 -07:00
Jee Jee Li
1819fbda63
[Quantization] Bump to use latest bitsandbytes (#20424)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-03 21:58:46 +08:00
Li, Jiang
7f0367109e
[CI/Build][CPU] Enable cross compilation in CPU release pipeline (#20423)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-03 05:26:12 -07:00
Louie Tsai
9965c47d0d
Enable CPU nightly performance benchmark and its Markdown report (#18444)
Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
2025-07-02 17:50:25 -07:00
Tyler Michael Smith
bdb84e26b0
[Bugfix] Fixes for FlashInfer's TORCH_CUDA_ARCH_LIST (#20136)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
2025-07-02 17:15:11 -07:00
Fabien Dupont
3c545c0c3b
[CI/Build] Allow hermetic builds (#18064)
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Fabien Dupont <fabiendupont@pm.me>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Elias Levy <eliaslevy@google.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-06-27 09:04:39 -07:00
Kunshang Ji
b69781f107
[Hardware][Intel GPU] Add v1 Intel GPU support with Flash attention backend. (#19560)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-06-26 09:27:18 -07:00
Li, Jiang
0567c8249f
[CPU] Fix torch version in x86 CPU backend (#19258)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-06-26 03:34:47 -07:00
Michael Goin
296ce95d8e
[CI] Add SM120 to the Dockerfile (#19794)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-25 16:23:56 -07:00
TJian
2de12be428
[ROCm] [AITER] [Bugfix] Patch for AITER commit 648764942e552a8bb5fe16026703716a81f05374 (#18990)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-06-18 22:56:31 -07:00
Michael Goin
497a91e9f7
[CI] Update FlashInfer to 0.2.6.post1 (#19297)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-11 22:57:28 +08:00
niu_he
a45b979d9f
[BugFix] Fix docker build cpu-dev image error (#19394)
Signed-off-by: niu_he <carlton2tang@gmail.com>
2025-06-10 20:56:40 -07:00
pramenku
88be823d57
[AMD] Update compatible packaging version (#19309)
Signed-off-by: pramkuma <Pramendra.Kumar@amd.com>
2025-06-07 20:55:09 +08:00
Alexei-V-Ivanov-AMD
e010688f50
[Build][ROCm] Update Dockerfile.rocm (#19296)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
2025-06-06 19:35:16 -04:00
Nishidha
94ecee6282
Fixed ppc build when it runs on non-RHEL based linux distros (#18422)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
Signed-off-by: Md. Shafi Hussain <Md.Shafi.Hussain@ibm.com>
Signed-off-by: npanpaliya <nishidha.panpaliya@partner.ibm.com>
Co-authored-by: Md. Shafi Hussain <Md.Shafi.Hussain@ibm.com>
2025-06-06 11:54:26 -07:00
Yang Wang
25b918eee6
[Torch Nightly]add missing dependency (#18770)
Signed-off-by: Yang Wang <elainewy@meta.com>
2025-06-04 21:56:12 -07:00
Lu Fang
d32aa2e670
[Bugfix] Use cmake 3.26.1 instead of 3.26 to avoid build failure (#19019)
Signed-off-by: Lu Fang <lufang@fb.com>
2025-06-03 00:16:17 -07:00
Gregory Shtrasberg
9e6f61e8c3
[ROCm][Build] Clean up the ROCm build (#19040)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-06-02 20:47:47 -07:00
Li, Jiang
8655f47f37
[CPU][CI] Re-enable the CPU CI tests (#19046)
Signed-off-by: jiang.li <jiang1.li@intel.com>
2025-06-02 20:46:47 -07:00
Cyrus Leung
7d9216495c
[Doc] Update references to doc files (#18637)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-23 15:49:21 -07:00
Huy Do
1645b60196
Use prebuilt FlashInfer x86_64 PyTorch 2.7 CUDA 12.8 wheel for CI (#18537)
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-05-23 21:17:16 +00:00
Kay Yan
7ab056c273
[Hardware][CPU] Update intel_extension_for_pytorch 2.7.0 and move to requirements/cpu.txt (#18542)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-05-23 04:38:42 -07:00
Harry Mellor
a1fe24d961
Migrate docs from Sphinx to MkDocs (#18145)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-23 02:09:53 -07:00
Tyler Michael Smith
6e588da0f4
[Build/CI] Fix CUDA 11.8 build (#17679)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-22 12:13:54 -07:00
Satyajith Chilappagari
e2d7d31244
[Neuron] Update Dockerfile.neuron to use latest neuron release (2.23) (#18512)
Signed-off-by: Satyajith Chilappagari <satchill@amazon.com>
2025-05-22 02:17:34 -07:00
vllmellm
94d8ec8d2b
[FEAT][ROCm] Upgrade AITER MLA v1 backend (#18338)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-05-21 10:34:28 -07:00
Kebe
371376f996
[Build] fix Dockerfile shell (#18402) 2025-05-21 07:32:06 -07:00
Dilip Gowda Bhagavan
23baa2180b
fix:Build torch wheel inline rather than picking from nightly (#18351)
Signed-off-by: Dilip Gowda Bhagavan <dilip.bhagavan@ibm.com>
2025-05-20 22:22:24 +00:00
Simon Mo
47fda6d089
[Build] Supports CUDA 12.6 and 11.8 after Blackwell Update (#18316)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-05-18 23:19:33 -07:00
Nishidha
f880d42582
Fixed build on ppc64le due to openssl conflicts (#18262)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
2025-05-17 00:23:46 -07:00
Michael Goin
dcfe95234c
Update Dockerfile to build for Blackwell (#18095) 2025-05-17 00:23:25 -07:00
Bowen Wang
7fdfa01530
[Sampler] Adapt to FlashInfer 0.2.3 sampler API (#15777)
Signed-off-by: Bowen Wang <abmfy@icloud.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-05-16 15:14:03 -07:00
vllmellm
3c9396a64f
[FEAT][ROCm]: Support AITER MLA on V1 Engine (#17523)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: qli88 <qiang.li2@amd.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>
2025-05-09 10:42:05 +08:00
Yang Wang
6de3e13413
Add logging for torch nightly version (#17669)
Signed-off-by: Yang Wang <elainewy@meta.com>
2025-05-07 00:45:51 +00:00
Hongxia Yang
90d0a54c4d
[ROCm] Effort to reduce the number of environment variables in command line (#17229)
Signed-off-by: Hongxia Yang <hongxia.yang@amd.com>
2025-04-30 23:27:06 -07:00
Alexei-V-Ivanov-AMD
7ab643e425
FIxing the AMD test failures caused by PR#16457 (#17511)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
2025-04-30 23:23:07 -07:00
Kunshang Ji
ed6cfb90c8
[Hardware][Intel GPU] Upgrade to torch 2.7 (#17444)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Qiming Zhang <qiming1.zhang@intel.com>
2025-04-30 00:03:58 -07:00
Huy Do
2c4f59afc3
Update PyTorch to 2.7.0 (#16859) 2025-04-29 19:08:04 -07:00
Dilip Gowda Bhagavan
c9c1b59e59
Fix: Python package installation for opentelmetry (#17049)
Signed-off-by: Dilip Gowda Bhagavan <dilip.bhagavan@ibm.com>
2025-04-29 20:20:24 +00:00
Reid
08e15defa9
[CI/Build] Add retry mechanism for add-apt-repository (#17107)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-04-29 10:40:52 -07:00
Nicolò Lucchesi
792595b59d
[TPU][V1][CI] Replace python3 setup.py develop with standard pip install --e on TPU (#17374)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-04-29 10:36:48 -07:00
Lennart K. M. Schulz
d1aeea7553
[Bugfix] Fix missing ARG in Dockerfile for arm64 platforms (#17261)
Signed-off-by: lkm-schulz <44176356+lkm-schulz@users.noreply.github.com>
2025-04-27 19:38:14 -07:00
Sangyeon Cho
b07d741661
[CI/Build] workaround for CI build failure (#17070)
Signed-off-by: csy1204 <josang1204@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-04-23 16:14:18 -07:00