Li, Jiang
|
67cee40da0
|
[CI/Build][Bugfix] Fix Qwen VL tests on CPU (#23818)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-08-28 11:57:05 +00:00 |
|
Kunshang Ji
|
fce10dbed5
|
[XPU] Add xpu torch.compile support (#22609)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-08-27 05:33:27 +00:00 |
|
Pate Motter
|
c34c82b7fe
|
[TPU][Bugfix] Fixes prompt_token_ids error in tpu tests. (#23574)
Signed-off-by: Pate Motter <patemotter@google.com>
|
2025-08-25 14:29:16 -07:00 |
|
youkaichao
|
e0b056e443
|
[ci/build] Fix abi tag for aarch64 (#23329)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-08-21 23:32:55 +08:00 |
|
QiliangCui
|
8993073dc1
|
[CI] Delete images older than 24h. (#23291)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-08-20 21:15:20 -07:00 |
|
Li, Jiang
|
7be5d113d8
|
[CPU] Refactor CPU W8A8 scaled_mm (#23071)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-08-21 09:34:24 +08:00 |
|
Kunshang Ji
|
5c79b0d648
|
[XPU][CI]add xpu env vars in CI scripts (#22946)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-08-18 09:47:03 +00:00 |
|
Michael Goin
|
4fc722eca4
|
[Kernel/Quant] Remove AQLM (#22943)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2025-08-16 19:38:21 +00:00 |
|
Harry Mellor
|
839ab00349
|
Re-enable Xet on TPU tests now that hf_xet has been updated (#22666)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-08-11 19:54:40 -07:00 |
|
Kyuyeun Kim
|
9a0c5ded5a
|
[TPU] Add support for online w8a8 quantization (#22425)
Signed-off-by: Kyuyeun Kim <kyuyeunk@google.com>
|
2025-08-08 23:12:54 -07:00 |
|
Siyuan Liu
|
4b29d2784b
|
[CI][TPU] Fix docker clean up (#22271)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
|
2025-08-05 23:54:56 +00:00 |
|
Roger Wang
|
067c34a155
|
docs: remove deprecated disable-log-requests flag (#22113)
Signed-off-by: Roger Wang <hey@rogerw.me>
|
2025-08-02 00:19:48 -07:00 |
|
Daniele
|
d2aab336ad
|
[CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES (#21599)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
|
2025-07-31 15:00:08 +08:00 |
|
Li, Jiang
|
65e8466c37
|
[Bugfix] Fix environment variable setting in CPU Dockerfile (#21730)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-28 11:02:39 +00:00 |
|
Ye (Charlotte) Qi
|
e7c4f9ee86
|
[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI (#21355)
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
|
2025-07-26 07:10:14 -07:00 |
|
QiliangCui
|
7728dd77bb
|
[TPU][Test] Divide TPU v1 Test into 2 parts. (#21431)
|
2025-07-26 06:20:30 +00:00 |
|
QiliangCui
|
07d80d7b0e
|
[TPU][TEST] HF_HUB_DISABLE_XET=1 the test 3. (#21539)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-07-24 15:33:04 -07:00 |
|
Liangliang Ma
|
13e4ee1dc3
|
[XPU][UT] increase intel xpu CI test scope (#21492)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
|
2025-07-23 20:24:04 -07:00 |
|
QiliangCui
|
14bf19e39f
|
[TPU][TEST] Fix the downloading issue in TPU v1 test 11. (#21418)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-07-23 11:29:36 -07:00 |
|
Li, Jiang
|
a15a50fc17
|
[CPU] Enable shared-memory based pipeline parallel for CPU backend (#21289)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-21 09:07:08 -07:00 |
|
Woosuk Kwon
|
752c6ade2e
|
[V0 Deprecation] Deprecate BlockSparse Attention & Phi3-Small (#21217)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-07-19 13:53:17 -07:00 |
|
Li, Jiang
|
e3a0e43d7f
|
[bugfix] Fix auto thread-binding when world_size > 1 in CPU backend and refactor code (#21032)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-19 05:13:55 -07:00 |
|
XiongfeiWei
|
58760e12b1
|
[TPU] Start using python 3.12 (#21000)
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
|
2025-07-16 19:37:44 -07:00 |
|
Chendi.Xue
|
e9534c7202
|
[CI][HPU] update for v0 deprecate by switching to VLLM_TARGET_DEVICE=empty (#21006)
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
|
2025-07-15 20:07:05 -07:00 |
|
QiliangCui
|
c66e38ea4c
|
[Test] Remove docker build from test. (#20542)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-07-10 11:21:58 -07:00 |
|
Kunshang Ji
|
b6e7e3d58f
|
[Intel GPU] support ray as distributed executor backend for XPU. (#20659)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-07-09 00:36:58 -07:00 |
|
Li, Jiang
|
7721ef1786
|
[CI/Build][CPU] Fix CPU CI and remove all CPU V0 files (#20560)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-07 22:13:44 -07:00 |
|
Liangliang Ma
|
2c5ebec064
|
[XPU][CI] add v1/core test in xpu hardware ci (#20537)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
|
2025-07-07 01:16:40 -07:00 |
|
Cyrus Leung
|
9fb52e523a
|
[V1] Support any head size for FlexAttention backend (#20467)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-07-06 09:54:36 -07:00 |
|
Woosuk Kwon
|
e202dd2736
|
[V0 deprecation] Remove V0 CPU/XPU/TPU backends (#20412)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
|
2025-07-06 08:48:13 -07:00 |
|
QiliangCui
|
4ff61ababa
|
[TPU] Add a case to cover RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8 (#20385)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-07-03 06:46:41 +00:00 |
|
Li, Jiang
|
6cc1e7d96d
|
[CPU] Update custom ops for the CPU backend (#20255)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-07-01 07:25:03 +00:00 |
|
Chendi.Xue
|
a2f14dc8f9
|
[CI][Intel Gaudi][vllm-Plugin]Add CI for hpu-plugin-v1-test (#20196)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-07-01 04:17:07 +00:00 |
|
Chengji Yao
|
04e1642e32
|
[TPU] add kv cache update kernel (#19928)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
|
2025-06-26 10:01:37 -07:00 |
|
Kunshang Ji
|
b69781f107
|
[Hardware][Intel GPU] Add v1 Intel GPU support with Flash attention backend. (#19560)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-06-26 09:27:18 -07:00 |
|
QiliangCui
|
4e0db57fff
|
Fix the path to the testing script. (#20082)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-06-25 20:48:17 +00:00 |
|
QiliangCui
|
a738dbb2a1
|
Update test case parameter to have the throughput above 8.0 (#19994)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-06-24 00:18:10 +00:00 |
|
Elaine Zhao
|
b6bad3d186
|
[CI][Neuron] Fail and exit on first error (#19622)
Signed-off-by: Elaine Zhao <elaineyz@amazon.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-06-20 12:27:51 +08:00 |
|
Li, Jiang
|
6458721108
|
[CPU] Refine default config for the CPU backend (#19539)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-06-13 13:27:39 +08:00 |
|
Li, Jiang
|
e4248849ec
|
[BugFix][CPU] Fix CPU CI by ignore collecting test_pixtral (#19411)
Signed-off-by: jiang.li <jiang1.li@intel.com>
|
2025-06-10 12:02:40 +00:00 |
|
Reid
|
12e5829221
|
[doc] improve ci doc (#19307)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-06-09 07:26:12 +00:00 |
|
Aaruni Aggarwal
|
c4296b1a27
|
[CI][PowerPC] Use a more appropriate way to select testcase in tests/models/language/pooling/test_embedding.py (#19253)
Signed-off-by: Aaruni Aggarwal <aaruniagg@gmail.com>
|
2025-06-07 11:52:52 +08:00 |
|
QiliangCui
|
66c508b137
|
[TPU][Test] Add script to run benchmark on TPU for buildkite (#19039)
Signed-off-by: Qiliang Cui <derrhein@gmail.com>
|
2025-06-06 20:10:24 -07:00 |
|
Nishidha
|
94ecee6282
|
Fixed ppc build when it runs on non-RHEL based linux distros (#18422)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
Signed-off-by: Md. Shafi Hussain <Md.Shafi.Hussain@ibm.com>
Signed-off-by: npanpaliya <nishidha.panpaliya@partner.ibm.com>
Co-authored-by: Md. Shafi Hussain <Md.Shafi.Hussain@ibm.com>
|
2025-06-06 11:54:26 -07:00 |
|
Simon Mo
|
da40380214
|
[Build] Annotate wheel and container path for release workflow (#19162)
Signed-off-by: simon-mo <simon.mo@hey.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-06-04 23:24:56 -07:00 |
|
Siyuan Liu
|
7ee2590478
|
[TPU] Update dynamo dump file name in compilation test (#19108)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
|
2025-06-04 16:13:43 -04:00 |
|
Siyuan Liu
|
8e972d9c44
|
[TPU] Skip hanging tests (#19115)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
|
2025-06-04 01:43:00 -07:00 |
|
Li, Jiang
|
4555143ea7
|
[CPU] V1 support for the CPU backend (#16441)
|
2025-06-03 18:43:01 -07:00 |
|
Li, Jiang
|
8655f47f37
|
[CPU][CI] Re-enable the CPU CI tests (#19046)
Signed-off-by: jiang.li <jiang1.li@intel.com>
|
2025-06-02 20:46:47 -07:00 |
|
Concurrensee
|
4ce42f9204
|
Adding "LoRA Test %N" to AMD production tests (#18929)
Signed-off-by: Yida Wu <yidawu@alumni.cmu.edu>
|
2025-06-02 20:46:44 -07:00 |
|