Nicolò Lucchesi
a66d131381
[TPU][Bugfix][CI] Fix broken tests/build dependency ( #25255 )
...
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-09-22 09:55:04 +00:00
Woosuk Kwon
52c2a8d4ad
[V0 Deprecation] Remove LLMEngine ( #25033 )
...
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-20 17:56:30 -07:00
Aaron Pham
29283e8976
[Chore] Cleanup guided namespace, move to structured outputs config ( #22772 )
...
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-18 09:20:27 +00:00
Woosuk Kwon
759ef49b15
Remove V0 Encoder-Decoder Support ( #24907 )
...
Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai>
2025-09-15 21:17:14 -07:00
Simon Mo
fd2f10546c
[ci] fix wheel names for arm wheels ( #24898 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-09-15 14:39:08 -07:00
Fanli Lin
0cd72a7b72
[XPU] add missing dependency tblib for XPU CI ( #24639 )
...
Signed-off-by: Fanli Lin <fanli.lin@intel.com>
2025-09-11 11:22:33 +00:00
Sahithi Chigurupati
6910b56da2
[CI] Add nightly multiarch manifests to dockerhub ( #24102 )
...
Signed-off-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
Signed-off-by: Simon Mo <simon.mo@hey.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-09-09 01:18:09 +00:00
Woosuk Kwon
4172235ab7
[V0 deprecation] Deprecate V0 Neuron backend ( #21159 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-06 16:15:18 -07:00
Kunshang Ji
16ded21eeb
[XPU] support Triton Attention backend on Intel GPU ( #24149 )
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-09-04 20:41:08 +08:00
youkaichao
42dc59dbac
Update release pipeline post PyTorch 2.8.0 update ( #24073 )
...
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Huy Do <huydhn@gmail.com>
2025-09-03 10:09:19 +08:00
Li, Jiang
ad39106b16
[CPU] Enable data parallel for CPU backend ( #23903 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-08-29 02:19:58 -07:00
Jee Jee Li
b4f9e9631c
[CI/Build] Clean up LoRA test ( #23890 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-08-28 23:28:35 -07:00
Li, Jiang
67cee40da0
[CI/Build][Bugfix] Fix Qwen VL tests on CPU ( #23818 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-08-28 11:57:05 +00:00
Kunshang Ji
fce10dbed5
[XPU] Add xpu torch.compile support ( #22609 )
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-08-27 05:33:27 +00:00
Pate Motter
c34c82b7fe
[TPU][Bugfix] Fixes prompt_token_ids error in tpu tests. ( #23574 )
...
Signed-off-by: Pate Motter <patemotter@google.com>
2025-08-25 14:29:16 -07:00
youkaichao
e0b056e443
[ci/build] Fix abi tag for aarch64 ( #23329 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-08-21 23:32:55 +08:00
QiliangCui
8993073dc1
[CI] Delete images older than 24h. ( #23291 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-08-20 21:15:20 -07:00
Li, Jiang
7be5d113d8
[CPU] Refactor CPU W8A8 scaled_mm ( #23071 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-08-21 09:34:24 +08:00
Kunshang Ji
5c79b0d648
[XPU][CI]add xpu env vars in CI scripts ( #22946 )
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-08-18 09:47:03 +00:00
Michael Goin
4fc722eca4
[Kernel/Quant] Remove AQLM ( #22943 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-08-16 19:38:21 +00:00
Harry Mellor
839ab00349
Re-enable Xet on TPU tests now that hf_xet has been updated ( #22666 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-11 19:54:40 -07:00
Kyuyeun Kim
9a0c5ded5a
[TPU] Add support for online w8a8 quantization ( #22425 )
...
Signed-off-by: Kyuyeun Kim <kyuyeunk@google.com>
2025-08-08 23:12:54 -07:00
Siyuan Liu
4b29d2784b
[CI][TPU] Fix docker clean up ( #22271 )
...
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
2025-08-05 23:54:56 +00:00
Roger Wang
067c34a155
docs: remove deprecated disable-log-requests flag ( #22113 )
...
Signed-off-by: Roger Wang <hey@rogerw.me>
2025-08-02 00:19:48 -07:00
Daniele
d2aab336ad
[CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES ( #21599 )
...
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2025-07-31 15:00:08 +08:00
Li, Jiang
65e8466c37
[Bugfix] Fix environment variable setting in CPU Dockerfile ( #21730 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-28 11:02:39 +00:00
Ye (Charlotte) Qi
e7c4f9ee86
[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI ( #21355 )
...
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
2025-07-26 07:10:14 -07:00
QiliangCui
7728dd77bb
[TPU][Test] Divide TPU v1 Test into 2 parts. ( #21431 )
2025-07-26 06:20:30 +00:00
QiliangCui
07d80d7b0e
[TPU][TEST] HF_HUB_DISABLE_XET=1 the test 3. ( #21539 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-07-24 15:33:04 -07:00
Liangliang Ma
13e4ee1dc3
[XPU][UT] increase intel xpu CI test scope ( #21492 )
...
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
2025-07-23 20:24:04 -07:00
QiliangCui
14bf19e39f
[TPU][TEST] Fix the downloading issue in TPU v1 test 11. ( #21418 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-07-23 11:29:36 -07:00
Li, Jiang
a15a50fc17
[CPU] Enable shared-memory based pipeline parallel for CPU backend ( #21289 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-21 09:07:08 -07:00
Woosuk Kwon
752c6ade2e
[V0 Deprecation] Deprecate BlockSparse Attention & Phi3-Small ( #21217 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-19 13:53:17 -07:00
Li, Jiang
e3a0e43d7f
[bugfix] Fix auto thread-binding when world_size > 1 in CPU backend and refactor code ( #21032 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-19 05:13:55 -07:00
XiongfeiWei
58760e12b1
[TPU] Start using python 3.12 ( #21000 )
...
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
2025-07-16 19:37:44 -07:00
Chendi.Xue
e9534c7202
[CI][HPU] update for v0 deprecate by switching to VLLM_TARGET_DEVICE=empty ( #21006 )
...
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
2025-07-15 20:07:05 -07:00
QiliangCui
c66e38ea4c
[Test] Remove docker build from test. ( #20542 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-07-10 11:21:58 -07:00
Kunshang Ji
b6e7e3d58f
[Intel GPU] support ray as distributed executor backend for XPU. ( #20659 )
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-07-09 00:36:58 -07:00
Li, Jiang
7721ef1786
[CI/Build][CPU] Fix CPU CI and remove all CPU V0 files ( #20560 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-07 22:13:44 -07:00
Liangliang Ma
2c5ebec064
[XPU][CI] add v1/core test in xpu hardware ci ( #20537 )
...
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
2025-07-07 01:16:40 -07:00
Cyrus Leung
9fb52e523a
[V1] Support any head size for FlexAttention backend ( #20467 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-07-06 09:54:36 -07:00
Woosuk Kwon
e202dd2736
[V0 deprecation] Remove V0 CPU/XPU/TPU backends ( #20412 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2025-07-06 08:48:13 -07:00
QiliangCui
4ff61ababa
[TPU] Add a case to cover RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 ( #20385 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-07-03 06:46:41 +00:00
Li, Jiang
6cc1e7d96d
[CPU] Update custom ops for the CPU backend ( #20255 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-07-01 07:25:03 +00:00
Chendi.Xue
a2f14dc8f9
[CI][Intel Gaudi][vllm-Plugin]Add CI for hpu-plugin-v1-test ( #20196 )
...
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-07-01 04:17:07 +00:00
Chengji Yao
04e1642e32
[TPU] add kv cache update kernel ( #19928 )
...
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-06-26 10:01:37 -07:00
Kunshang Ji
b69781f107
[Hardware][Intel GPU] Add v1 Intel GPU support with Flash attention backend. ( #19560 )
...
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-06-26 09:27:18 -07:00
QiliangCui
4e0db57fff
Fix the path to the testing script. ( #20082 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-06-25 20:48:17 +00:00
QiliangCui
a738dbb2a1
Update test case parameter to have the throughput above 8.0 ( #19994 )
...
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
2025-06-24 00:18:10 +00:00
Elaine Zhao
b6bad3d186
[CI][Neuron] Fail and exit on first error ( #19622 )
...
Signed-off-by: Elaine Zhao <elaineyz@amazon.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-20 12:27:51 +08:00