221 Commits

Author SHA1 Message Date
Kunshang Ji
ed6cfb90c8
[Hardware][Intel GPU] Upgrade to torch 2.7 (#17444)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Qiming Zhang <qiming1.zhang@intel.com>
2025-04-30 00:03:58 -07:00
Kunshang Ji
6ed9f6047e
[Intel GPU] [CI]Fix XPU ci, setuptools >=80.0 have build issue (#17298)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-04-29 22:54:10 -07:00
Huy Do
2c4f59afc3
Update PyTorch to 2.7.0 (#16859) 2025-04-29 19:08:04 -07:00
Aaron Pham
b37685afbb
[CI] Uses Python 3.11 for TPU (#17359)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-04-29 17:39:16 +00:00
Harry Mellor
4a5e13149a
Update docs requirements (#17379)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-29 11:35:47 +00:00
Gregory Shtrasberg
4464109219
[Build][Bugfix] Restrict setuptools version to <80 (#17320)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-04-29 00:17:23 -07:00
Agata Dobrzyniewicz
c48334d405
[Hardware][Intel-Gaudi] Update hpu-extension and update bucketing system for HPU device (#17186)
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
2025-04-26 05:55:14 -07:00
Cyrus Leung
9d98ab5ec6
[Misc] Inline Molmo requirements (#17190)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-25 16:41:44 +00:00
Harry Mellor
0bd7f8fca5
Bump Transformers to 4.51.3 (#17116)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-25 08:34:34 -07:00
Harry Mellor
0422ce109f
Add :markdownhelp: to EngineArgs docs so markdown docstrings render properly (#17124)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-24 10:28:45 -07:00
Eyshika Agarwal
47bdee409c
Molmo Requirements (#17026)
Signed-off-by: Eyshika Agarwal <eyshikaengineer@gmail.com>
Signed-off-by: eyshika <eyshikaengineer@gmail.com>
2025-04-24 10:08:37 -07:00
Yang Wang
f67e9e9f22
add Dockerfile build vllm against torch nightly (#16936)
Signed-off-by: Yang Wang <elainewy@meta.com>
2025-04-22 19:08:27 -07:00
Isotr0py
83f3c3bd91
[Model] Refactor Phi-4-multimodal to use merged processor and support V1 (#15477)
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-19 02:26:11 -07:00
Tarun Kumar
e37073efd7
Add property-based testing for vLLM endpoints using an API defined by an OpenAPI 3.1 schema (#16721)
Signed-off-by: Tarun Kumar <takumar@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
2025-04-17 21:08:27 -07:00
Nick Hill
05fcd1b430
[V1][Perf] Faster incremental detokenization (#15137)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-17 07:45:24 -07:00
Sage Moore
44fa4d556c
[ROCM] Bind triton version to 3.2 in requirements-built.txt (#16664)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
2025-04-16 14:05:28 +08:00
Shinichi Hemmi
3badb0213b
[Model] Add PLaMo2 (#14323)
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com>
Signed-off-by: shemmi <shemmi@preferred.jp>
Co-authored-by: Kento Nozawa <nzw0301@preferred.jp>
Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp>
Co-authored-by: Calvin Metzger <metzger@preferred.jp>
2025-04-15 19:31:30 -07:00
Taneem Ibrahim
70e7ed841d
[BugFix]: Update minimum pyzmq version (#16549)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
2025-04-14 20:06:03 -07:00
Siyuan Liu
c64ee87267
[Hardware][TPU] Add torchvision to tpu dependency file (#16616)
Signed-off-by: Siyuan Liu <lsiyuan@google.com>
2025-04-14 17:50:46 -04:00
courage17340
b1308b84a3
[Model][VLM] Add Kimi-VL model support (#16387)
Signed-off-by: courage17340 <courage17340@163.com>
2025-04-14 21:41:48 +00:00
Harry Mellor
9883a18859
Fix triton install condition on CPU (#16600)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-14 17:06:01 +00:00
Harry Mellor
51baa9c333
Don't install triton on ppc64le platform (#16470)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-11 10:11:00 +00:00
cyyever
ec1f9c8c91
Update Numba to 0.61.2 (#16376)
Signed-off-by: cyy <cyyever@outlook.com>
2025-04-10 07:59:37 +00:00
Roger Wang
0d4d06fe2f
[CI][Bugfix] Pin triton version for CPU (#16384)
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-04-10 04:35:00 +00:00
Russell Bryant
fee5b8d37f
[Build/CI] Add tracing deps to vllm container image (#15224)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-09 19:14:06 +00:00
Chendi.Xue
566f10a929
[CI]Fix hpu docker and numpy version for CI (#16355)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
2025-04-09 17:52:26 +00:00
Chengji Yao
b1eb4ca152
[TPU] Update PyTorch/XLA (#16288)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-04-09 14:46:32 +08:00
Russell Bryant
cb84e45ac7
[Core] Upgrade to xgrammar 0.1.18, add cache size limit (#16283)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-08 19:13:22 -07:00
Chauncey
102bf967f0
[Model] Add smolvlm support (#16017)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
2025-04-08 19:12:17 -07:00
Harry Mellor
90cb44eb02
Update to transformers==4.51.1 (#16257)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-08 06:53:39 -07:00
Isotr0py
f6b32efb7f
[Bugfix] Fix and reorganize broken GGUF tests and bump gguf version (#16194)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-04-08 13:38:13 +08:00
Miles Williams
f94ab12f79
[Misc] Update compressed-tensors to version 0.9.3 (#16196)
Signed-off-by: Miles Williams <42222518+mlsw@users.noreply.github.com>
2025-04-07 19:09:06 -07:00
Nick Hill
7f6d47c1a2
[V1][BugFix] Exit properly if engine core fails during startup (#16137)
Signed-off-by: Nick Hill <nhill@redhat.com>
2025-04-07 15:30:15 -07:00
Lu Fang
55dcce91df
Upstream Llama4 Support to Main (#16113)
Signed-off-by: Aston Zhang <22279212+astonzhang@users.noreply.github.com>
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
Signed-off-by: drisspg <drisspguessous@gmail.com>
Signed-off-by: Jon Swenson <jmswen@gmail.com>
Signed-off-by: Keyun Tong <tongkeyun@gmail.com>
Signed-off-by: Lu Fang <fanglu@meta.com>
Signed-off-by: Xiaodong Wang <xdwang@meta.com>
Signed-off-by: Yang Chen <yangche@fb.com>
Signed-off-by: Ye (Charlotte) Qi <yeq@meta.com>
Signed-off-by: Yong Hoon Shin <yhshin@meta.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Lu Fang <lufang@fb.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Lu Fang <fanglu@fb.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-07 08:06:27 -07:00
Chengji Yao
9bde5ba127
[TPU] Update PyTorch/XLA (#16130)
Signed-off-by: Chengji Yao <chengjiyao@google.com>
2025-04-06 18:25:55 +00:00
yarongmu-google
45b1ff7a25
[Misc][Performance] Advance tpu.txt to the most recent nightly torch … (#16024) 2025-04-03 17:32:54 +00:00
Harry Mellor
55acf86bf8
Fix huggingface-cli[hf-xet] -> huggingface-cli[hf_xet] (#15969)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-02 23:37:30 +00:00
Harry Mellor
51826d51fa
Add minimum version for huggingface_hub to enable Xet downloads (#15873)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-02 02:03:36 -07:00
Russell Bryant
14e53ed11f
[V1] Fix json_object support with xgrammar (#15488)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-04-02 02:00:08 -07:00
Chris Thi
101f1481f9
[Build/CI] Update lm-eval to 0.4.8 (#15912)
Signed-off-by: Chris Thi <chris.c.thi@gmail.com>
2025-04-02 01:47:57 -07:00
Gregory Shtrasberg
a57a3044aa
[ROCm][Build][Bugfix] Bring the base dockerfile in sync with the ROCm fork (#15820)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-04-01 08:56:39 -07:00
Isotr0py
4e5a0f6ae2
[Misc] Allow using OpenCV as video IO fallback (#15055)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-01 15:55:13 +00:00
Harry Mellor
e5ef4fa99a
Upgrade transformers to v4.50.3 (#13905)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 08:59:37 -07:00
yarongmu-google
7c1f760024
[Kernel][TPU][ragged-paged-attn] vLLM code change for PR#8896 (#15659)
Signed-off-by: Yarong Mu <ymu@google.com>
2025-03-28 21:13:15 -07:00
cyyever
8427f70493
Use numba 0.61 for python 3.10+ to support numpy>=2 (#15692)
Signed-off-by: cyy <cyyever@outlook.com>
2025-03-29 12:11:51 +08:00
Rui Qiao
df8d3d1287
[Misc] Restrict ray version dependency and update PP feature warning in V1 (#15556) 2025-03-27 06:21:07 +00:00
Md. Shafi Hussain
3e2f37a69a
Dockerfile.ppc64le changes to move to UBI (#15402)
Signed-off-by: Md. Shafi Hussain <Md.Shafi.Hussain@ibm.com>
2025-03-25 10:15:14 +00:00
Russell Bryant
a09ad90a72
[V1] guidance backend for structured output + auto fallback mode (#14779)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Loc Huynh <jc1da.3011@gmail.com>
Co-authored-by: Michal Moskal <michal@moskal.me>
2025-03-24 21:02:33 -07:00
Manish Sethi
761702fd19
[Core] Integrate fastsafetensors loader for loading model weights (#10647)
Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com>
2025-03-24 08:08:02 -07:00
Russell Bryant
b877031d80
Remove openvino support in favor of external plugin (#15339)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-22 14:06:39 -07:00