Michael Goin
|
aa4502e7f3
|
[CI][Bugfix] Fix failing V1 Test due to missing 'cache_salt' arg (#17500)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-30 21:03:30 -07:00 |
|
Michael Goin
|
17b4d85f63
|
[CI][TPU] Skip structured outputs+spec decode tests on TPU (#17510)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-30 20:36:20 -07:00 |
|
NaLan ZeYu
|
1144a8efe7
|
[Bugfix] Temporarily disable gptq_bitblas on ROCm (#17411)
Signed-off-by: Yan Cangang <nalanzeyu@gmail.com>
|
2025-04-30 19:51:45 -07:00 |
|
Gregory Shtrasberg
|
08fb5587b4
|
[Bugfix][ROCm] Fix import error on ROCm (#17495)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-04-30 19:51:42 -07:00 |
|
Siyuan Liu
|
dbc18e7816
|
[CI][TPU] Skip Multimodal test (#17488)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
|
2025-04-30 19:51:39 -07:00 |
|
Alex Brooks
|
02bd654846
|
[Misc] Rename Audios -> Audio in Qwen2audio Processing (#17507)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
|
2025-04-30 19:51:36 -07:00 |
|
Rahul Tuli
|
200bbf92e8
|
Bump Compressed Tensors version to 0.9.4 (#17478)
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-04-30 15:24:45 -07:00 |
|
Chen Zhang
|
81ecf425f0
|
[v1][Spec Decode] Make sliding window compatible with eagle prefix caching (#17398)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-04-30 18:25:53 +00:00 |
|
David Xia
|
42d9a2c4c7
|
doc: fix bug report Github template formatting (#17486)
Signed-off-by: David Xia <david@davidxia.com>
|
2025-04-30 10:03:20 -07:00 |
|
Reid
|
2ac74d098e
|
[doc] add install tips (#17373)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-30 17:02:41 +00:00 |
|
Gregory Shtrasberg
|
584f5fb4c6
|
[Bugfix][ROCm] Restrict ray version due to a breaking release (#17480)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-04-30 09:59:06 -07:00 |
|
zh Wang
|
d586ddc691
|
[BugFix] Fix authorization of openai_transcription_client.py (#17321)
Signed-off-by: zh Wang <rekind133@outlook.com>
|
2025-04-30 09:51:05 -07:00 |
|
Michael Goin
|
0b7e701dd4
|
[Docs] Update optimization.md doc (#17482)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-30 09:34:02 -07:00 |
|
Russell Bryant
|
947f2f5375
|
[V1] Allow turning off pickle fallback in vllm.v1.serial_utils (#17427)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-30 16:10:54 +00:00 |
|
Pete Savage
|
739e03b344
|
[Bugfix] Fixed mistral tokenizer path when pointing to file (#17457)
Signed-off-by: Pete Savage <psavage@redhat.com>
|
2025-04-30 08:08:37 -07:00 |
|
Aaron Pham
|
da4e7687b5
|
[Fix] Support passing args to logger (#17425)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-04-30 08:06:58 -07:00 |
|
Russell Bryant
|
39317cf42b
|
[Docs] Add command for running mypy tests from CI (#17475)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-04-30 08:06:09 -07:00 |
|
Chauncey
|
2990cee95b
|
[Feature] The Qwen3 reasoning parser supports guided decoding (#17466)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-30 07:48:21 -07:00 |
|
Alec
|
0be6d05b5e
|
[V1][Metrics] add support for kv event publishing (#16750)
Signed-off-by: alec-flowers <aflowers@nvidia.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Mark McLoughlin <markmc@redhat.com>
|
2025-04-30 07:44:45 -07:00 |
|
Marko Rosenmueller
|
77073c77bc
|
[Core] Prevent side-channel attacks via cache salting (#17045)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
|
2025-04-30 20:27:21 +08:00 |
|
Nicolò Lucchesi
|
a7d5b016bd
|
[TPU][V1][CI] Update regression test baseline for v6 CI (#17064)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-04-30 04:03:22 -07:00 |
|
rongfu.leng
|
d803786731
|
[V1][Bugfix]: vllm v1 verison metric num_gpu_blocks is None (#15755)
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
|
2025-04-30 18:20:39 +08:00 |
|
Chauncey
|
1534d389af
|
[Misc] Remove deprecated files (#17447)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-30 01:52:19 -07:00 |
|
Lu Fang
|
ece5a8b0b6
|
Make the _apply_rotary_emb compatible with dynamo (#17435)
|
2025-04-30 07:52:48 +00:00 |
|
Marco
|
54072f315f
|
[MODEL ADDITION] Ovis2 Model Addition (#15826)
Signed-off-by: Marco <121761685+mlinmg@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-04-30 07:33:29 +00:00 |
|
Chauncey
|
be633fba0f
|
[Bugfix] Fix AttributeError: 'State' object has no attribute 'engine_client' (#17434)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-30 00:11:04 -07:00 |
|
Kunshang Ji
|
ed6cfb90c8
|
[Hardware][Intel GPU] Upgrade to torch 2.7 (#17444)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Qiming Zhang <qiming1.zhang@intel.com>
|
2025-04-30 00:03:58 -07:00 |
|
Kunshang Ji
|
6ed9f6047e
|
[Intel GPU] [CI]Fix XPU ci, setuptools >=80.0 have build issue (#17298)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-04-29 22:54:10 -07:00 |
|
Michael Goin
|
a44c4f1d2f
|
Support LoRA for Mistral3 (#17428)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-29 21:10:30 -07:00 |
|
Huy Do
|
88fcf00dda
|
Fix some speculative decode tests with tl.dot (#17371)
Signed-off-by: Huy Do <huydhn@gmail.com>
|
2025-04-29 19:41:02 -07:00 |
|
Harry Mellor
|
d1f569b1b9
|
Fix call to logger.info_once (#17416)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 19:39:18 -07:00 |
|
Harry Mellor
|
13698db634
|
Improve configs - ModelConfig (#17130)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-30 10:38:22 +08:00 |
|
Huy Do
|
2c4f59afc3
|
Update PyTorch to 2.7.0 (#16859)
|
2025-04-29 19:08:04 -07:00 |
|
Gabriel Marinho
|
1c2bc7ead0
|
Truncation control for embedding models (#14776)
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-04-30 09:24:57 +08:00 |
|
Kevin H. Luu
|
4055130a85
|
[release] Always git fetch all to get latest tag on TPU release (#17322)
|
2025-04-29 17:52:11 -07:00 |
|
Benjamin Chislett
|
34120f5acd
|
[V1][Feature] Enable Speculative Decoding with Structured Outputs (#14702)
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
|
2025-04-30 00:02:10 +00:00 |
|
Harry Mellor
|
7489ec0bab
|
Remove Bamba 9B from CI (#17407)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 21:10:31 +00:00 |
|
Bryan Lu
|
70788bdbdc
|
[V1][Spec Decode] Apply torch.compile & cudagraph to EAGLE (#17211)
Signed-off-by: Bryan Lu <yuzhelu@amazon.com>
|
2025-04-29 21:10:00 +00:00 |
|
Dilip Gowda Bhagavan
|
c9c1b59e59
|
Fix: Python package installation for opentelmetry (#17049)
Signed-off-by: Dilip Gowda Bhagavan <dilip.bhagavan@ibm.com>
|
2025-04-29 20:20:24 +00:00 |
|
Harry Mellor
|
0350809f3a
|
Remove Falcon3 2x7B from CI (#17404)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 19:52:25 +00:00 |
|
Harry Mellor
|
a6977dbd15
|
Simplify (and fix) passing of guided decoding backend options (#17008)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 19:02:23 +00:00 |
|
Isotr0py
|
2fa2a50bf9
|
[Bugfix] Fix Minicpm-O-int4 GPTQ model inference (#17397)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-04-29 18:21:42 +00:00 |
|
Reid
|
08e15defa9
|
[CI/Build] Add retry mechanism for add-apt-repository (#17107)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-04-29 10:40:52 -07:00 |
|
Aaron Pham
|
b37685afbb
|
[CI] Uses Python 3.11 for TPU (#17359)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-04-29 17:39:16 +00:00 |
|
Nicolò Lucchesi
|
792595b59d
|
[TPU][V1][CI] Replace python3 setup.py develop with standard pip install --e on TPU (#17374)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-04-29 10:36:48 -07:00 |
|
casinca
|
0c1c788312
|
[Doc][Typo] Fixing label in new model requests link in overview.md (#17400)
|
2025-04-29 10:29:48 -07:00 |
|
Russell Bryant
|
56d64fbe30
|
[Docs] Propose a deprecation policy for the project (#17063)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-04-29 10:29:44 -07:00 |
|
Alexei-V-Ivanov-AMD
|
608968b7c5
|
Enabling multi-group kernel tests. (#17115)
Signed-off-by: Alexei V. Ivanov <alexei.ivanov@amd.com>
|
2025-04-29 10:27:27 -07:00 |
|
TY-AMD
|
06ffc7e1d3
|
[Misc][ROCm] Exclude cutlass_mla_decode for ROCm build (#17289)
Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>
|
2025-04-29 10:26:42 -07:00 |
|
Qiming Zhang
|
d3cf61b89b
|
fix gemma3 results all zero (#17364)
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
|
2025-04-29 09:40:25 -07:00 |
|