47 Commits

Author SHA1 Message Date
Kyuyeun Kim
df4d3a44a8
[TPU] Rename path to tpu platform (#28452)
Signed-off-by: Kyuyeun Kim <kyuyeunk@google.com>
2025-11-11 19:16:47 +00:00
Cyrus Leung
6ebffafbb6
[Misc] Clean up more utils (#27567)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-27 15:30:38 +00:00
Wentao Ye
52efc34ebf
[Log] Optimize Startup Log (#26740)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
2025-10-24 19:27:04 -04:00
Isotr0py
6ac5e06f7c
[Chore] Clean up pytorch helper functions in vllm.utils (#26908)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: isotr0py <2037008807@qq.com>
2025-10-18 09:48:22 -07:00
Cyrus Leung
4d4d6bad19
[Chore] Separate out vllm.utils.importlib (#27022)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-17 00:48:59 +00:00
wangxiyuan
577d498212
[Plugin] Make plugin group clear (#26757)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-10-14 07:49:59 +00:00
Harry Mellor
8fcaaf6a16
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-12 09:51:31 -07:00
Aydin Abiar
76afe4edf8
[Bugfix] Fix vllm bench ... on CPU-only head nodes (#25283)
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Aydin Abiar <aydin@anyscale.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-10-08 16:06:42 +00:00
Utkarsh Sharma
335b28f7d1
[TPU] Rename tpu_commons to tpu_inference (#26279)
Signed-off-by: Utkarsh Sharma <utksharma@google.com>
Co-authored-by: Utkarsh Sharma <utksharma@google.com>
Co-authored-by: Chengji Yao <chengjiyao@google.com>
2025-10-07 23:30:52 -07:00
Harry Mellor
d6953beb91
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-05 07:06:22 -07:00
Matthew Bonanni
2aaa423842
[Attention] Move Backend enum into registry (#25893)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-10-02 20:32:24 -07:00
Woosuk Kwon
4172235ab7
[V0 deprecation] Deprecate V0 Neuron backend (#21159)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-09-06 16:15:18 -07:00
wenxindongwork
8f0d516715
[TPU] Support Pathways in vLLM (#21417)
Signed-off-by: wenxindongwork <wenxindong@google.com>
2025-07-30 10:02:12 -07:00
Woosuk Kwon
4de7146351
[V0 deprecation] Remove V0 HPU backend (#21131)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-17 16:37:36 -07:00
Yang Yang
6e2c19ce22
[Refactor]Abstract Platform Interface for Distributed Backend and Add xccl Support for Intel XPU (#19410)
Signed-off-by: dbyoung18 <yang5.yang@intel.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
2025-07-07 04:32:32 +00:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText (#19100)
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Cyrus Leung
503f8487c2
[Misc] Reduce logs on startup (#18649)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-24 23:03:53 -07:00
Ning Xie
60cad94b86
[Hardware] correct method signatures for HPU,ROCm,XPU (#18551)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
2025-05-22 22:31:59 -07:00
Satyajith Chilappagari
043e4c4955
Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)
Signed-off-by: Satyajith Chilappagari <satchill@amazon.com>
Co-authored-by: Aaron Dou <yzdou@amazon.com>
Co-authored-by: Shashwat Srijan <sssrijan@amazon.com>
Co-authored-by: Chongming Ni <chongmni@amazon.com>
Co-authored-by: Amulya Ballakur <amulyaab@amazon.com>
Co-authored-by: Patrick Lange <patlange@amazon.com>
Co-authored-by: Elaine Zhao <elaineyz@amazon.com>
Co-authored-by: Lin Lin Pan <tailinpa@amazon.com>
Co-authored-by: Navyadhara Gogineni <navyadha@amazon.com>
Co-authored-by: Yishan McNabb <yishanm@amazon.com>
Co-authored-by: Mrinal Shukla <181322398+mrinalks@users.noreply.github.com>
2025-05-07 00:07:30 -07:00
Russell Bryant
b877031d80
Remove openvino support in favor of external plugin (#15339)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-22 14:06:39 -07:00
youkaichao
6eaf93020d
[platforms] improve rocm debugging info (#14257) 2025-03-04 21:32:18 -08:00
Michael Goin
6247bae6c6
[Bugfix] Restrict MacOS CPU detection (#14210)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-03-04 22:25:27 +08:00
youkaichao
ac65bc92df
[platform] add debug logging during inferring the device type (#14195)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-03-04 18:39:16 +08:00
Alex Brooks
9621667874
[Misc] Warn if the vLLM version can't be retrieved (#13501)
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
2025-02-20 06:24:48 +00:00
Isotr0py
d67cc21b78
[Bugfix][Platform][CPU] Fix cuda platform detection on CPU backend edge case (#13358)
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-02-16 18:55:27 +00:00
youkaichao
0408efc6d0
[Misc] Improve error message for incorrect pynvml (#12809)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-06 15:23:50 +08:00
youkaichao
ad4a9dc817
[cuda] manually import the correct pynvml module (#12679)
fixes problems like https://github.com/vllm-project/vllm/pull/12635 and
https://github.com/vllm-project/vllm/pull/12636 and
https://github.com/vllm-project/vllm/pull/12565

---------

Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-03 15:58:21 +08:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files (#12628)
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**

commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:18:24 2025 -0500

    Add SPDX license headers to python source files
    
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
    also be easily used by tools to help manage license compliance.
    
The Linux Foundation runs license scans against the codebase to help
ensure
    we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
    
    More information can be found on the SPDX site:
    
    - https://spdx.dev/learn/handling-license-info/
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date:   Fri Jan 31 14:36:32 2025 -0500

    Check for SPDX headers using pre-commit
    
    Signed-off-by: Russell Bryant <rbryant@redhat.com>

---------

Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Kevin H. Luu
64ea24d0b3
[ci/lint] Add back default arg for pre-commit (#12279)
Signed-off-by: kevin <kevin@anyscale.com>
2025-01-22 01:15:27 +00:00
Mengqing Cao
c64612802b
[Platform] improve platforms getattr (#12264)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
2025-01-21 14:42:41 +00:00
Işık
af69a6aded
fix: update platform detection for M-series arm based MacBook processors (#12227)
Signed-off-by: isikhi <huseyin.isik000@gmail.com>
2025-01-20 22:23:28 +00:00
Kunshang Ji
61af633256
[BUGFIX] Fix UnspecifiedPlatform package name (#11916)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-01-10 16:20:46 +08:00
youkaichao
b12e87f942
[platforms] enable platform plugins (#11602)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-30 20:24:45 +08:00
Michael Goin
7090c27bb2
[Bugfix] Only require XGrammar on x86 (#10865)
Signed-off-by: mgoin <michael@neuralmagic.com>
2024-12-03 10:32:21 -08:00
Conroy Cheers
f5792c7c4a
[Hardware][NVIDIA] Add non-NVML CUDA mode for Jetson (#9735)
Signed-off-by: Conroy Cheers <conroy@corncheese.org>
2024-11-26 10:26:28 -08:00
Mengqing Cao
8c1fb50705
[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
2024-11-19 11:22:26 +08:00
Konrad Zawora
a02a50e6e5
[Hardware][Intel-Gaudi] Add Intel Gaudi (HPU) inference backend (#6143)
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Ilia Taraban <tarabanil@gmail.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by: Sun Choi <schoi@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by: Zehao Huang <zehao.huang@intel.com>
Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by: Nir David <ndavid@habana.ai>
Co-authored-by: Yu-Zhou <yu.zhou@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by: Jacek Czaja <jczaja@habana.ai>
Co-authored-by: Yuan <yuan.zhou@outlook.com>
2024-11-06 01:09:10 -08:00
Yan Ma
04a3ae0aca
[Bugfix] Fix multi nodes TP+PP for XPU (#8884)
Signed-off-by: YiSheng5 <syhm@mail.ustc.edu.cn>
Signed-off-by: yan ma <yan.ma@intel.com>
Co-authored-by: YiSheng5 <syhm@mail.ustc.edu.cn>
2024-10-29 21:34:45 -07:00
Mengqing Cao
5cbdccd151
[Hardware][openvino] is_openvino --> current_platform.is_openvino (#9716) 2024-10-26 10:59:06 +00:00
xendo
9dbcce84a7
[Neuron] [Bugfix] Fix neuron startup (#9374)
Co-authored-by: Jerzy Zagorski <jzagorsk@amazon.com>
2024-10-22 12:51:41 +00:00
Tyler Titsworth
260024a374
[Bugfix][Intel] Fix XPU Dockerfile Build (#7824)
Signed-off-by: tylertitsworth <tyler.titsworth@intel.com>
Co-authored-by: youkaichao <youkaichao@126.com>
2024-09-27 23:45:50 -07:00
Li, Jiang
0b952af458
[Hardware][Intel] Support compressed-tensor W8A8 for CPU backend (#7257) 2024-09-11 09:46:46 -07:00
Woosuk Kwon
eeee1c3b1a
[TPU] Avoid initializing TPU runtime in is_tpu (#7763) 2024-08-21 21:31:49 -07:00
youkaichao
e54ebc2f8f
[doc] fix doc build error caused by msgspec (#7659) 2024-08-19 17:50:59 -07:00
youkaichao
4d2dc5072b
[hardware] unify usage of is_tpu to current_platform.is_tpu() (#7102) 2024-08-13 00:16:42 -07:00
Woosuk Kwon
42de2cefcb
[Misc] Add a wrapper for torch.inference_mode (#6618) 2024-07-21 18:43:11 -07:00
youkaichao
482045ee77
[hardware][misc] introduce platform abstraction (#6080) 2024-07-02 20:12:22 -07:00