Michael Goin
e8961e963a
Update flashinfer-python==0.2.10 ( #22389 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-06 18:10:24 -07:00
Michael Goin
a7cb6101ca
[CI/Build] Update flashinfer to 0.2.9 ( #22233 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-05 09:39:38 -07:00
Simon Mo
da31f6ad3d
Revert precompile wheel changes ( #22055 )
2025-08-01 08:26:24 +00:00
Michael Goin
0bd409cf01
Move flashinfer-python to optional extra vllm[flashinfer] ( #21959 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-31 18:02:11 -07:00
Doug Smith
58bb902186
fix(setup): improve precompiled wheel setup for Docker builds ( #22025 )
...
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-31 09:52:48 -07:00
Doug Smith
b9b753e7a7
For VLLM_USE_PRECOMPILED, only compiled .so files should be extracted ( #21964 )
2025-07-30 13:04:40 -07:00
Doug Smith
a1873db23d
docker: docker-aware precompiled wheel support ( #21127 )
...
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-29 14:45:19 -07:00
Benjamin Bartels
b194557a6c
Adds parallel model weight loading for runai_streamer ( #21330 )
...
Signed-off-by: bbartels <benjamin@bartels.dev>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-22 08:15:53 -07:00
Woosuk Kwon
4de7146351
[V0 deprecation] Remove V0 HPU backend ( #21131 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-07-17 16:37:36 -07:00
Patrick von Platen
e7e3e6d263
Voxtral ( #20970 )
...
Signed-off-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-07-15 07:35:30 -07:00
Sanger Steel
72d14d0eed
[Frontend] [Core] Integrate Tensorizer in to S3 loading machinery, allow passing arbitrary arguments during save/load ( #19619 )
...
Signed-off-by: Sanger Steel <sangersteel@gmail.com>
Co-authored-by: Eta <esyra@coreweave.com>
2025-07-07 22:47:43 -07:00
Isotr0py
8711bc5e68
[Misc] Add packages for benchmark as extra dependency ( #19089 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-06-04 04:18:48 -07:00
Simon Mo
02f0c7b220
[Misc] Add SPDX-FileCopyrightText ( #19100 )
...
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-06-03 11:20:17 -07:00
Daniele
43ff405b90
[CI/Build] remove regex from build dependencies ( #18945 )
...
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-05-30 04:02:50 -07:00
Luka Govedič
a3896c7f02
[Build] Fixes for CMake install ( #18570 )
2025-05-27 20:49:24 -04:00
Feng XiaoLong
4fc1bf813a
[Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking ( #18454 )
...
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
2025-05-23 16:16:26 -07:00
Huy Do
2c4f59afc3
Update PyTorch to 2.7.0 ( #16859 )
2025-04-29 19:08:04 -07:00
Lucas Wilkinson
d8bccde686
[BugFix] Fix vllm_flash_attn install issues ( #17267 )
...
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
2025-04-27 17:27:56 -07:00
Aaron Pham
e782e0a170
[Chore] added stubs for vllm_flash_attn during development mode ( #17228 )
...
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2025-04-26 07:45:26 -07:00
Isotr0py
4e5a0f6ae2
[Misc] Allow using OpenCV as video IO fallback ( #15055 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-04-01 15:55:13 +00:00
Yang Chen
f3aca1ee30
setup correct nvcc version with CUDA_HOME ( #15725 )
...
Signed-off-by: Yang Chen <yangche@fb.com>
2025-04-01 06:09:40 -07:00
yihong
e7ae3bf3d6
fix: better install requirement for install in setup.py ( #15796 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-31 05:13:32 -07:00
Manish Sethi
761702fd19
[Core] Integrate fastsafetensors loader for loading model weights ( #10647 )
...
Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com>
2025-03-24 08:08:02 -07:00
Russell Bryant
b877031d80
Remove openvino support in favor of external plugin ( #15339 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-22 14:06:39 -07:00
Russell Bryant
0a74bfce9c
setup.py: drop assumption about local main branch ( #14692 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-17 01:37:42 -07:00
Harry Mellor
206e2577fa
Move requirements into their own directory ( #12547 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-08 16:44:35 +00:00
Isotr0py
63137cd922
[Build] Add nightly wheel fallback when latest commit wheel unavailable ( #14358 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-03-06 22:10:57 -08:00
Cody Yu
f35f8e2242
[Build] Make sure local main branch is synced when VLLM_USE_PRECOMPILED=1 ( #13921 )
...
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
2025-03-03 16:43:14 +08:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
Michael Goin
ca377cf1b9
Use CUDA 12.4 as default for release and nightly wheels ( #12098 )
2025-02-26 19:06:37 -08:00
Lucas Wilkinson
f95903909f
[Kernel] FlashMLA integration ( #13747 )
...
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
2025-02-27 10:35:08 +08:00
Daniele
81dabf24a8
[CI/Build] force writing version file ( #13544 )
...
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2025-02-19 18:48:03 +08:00
Daniele
a02c86b4dd
[CI/Build] migrate static project metadata from setup.py to pyproject.toml ( #8772 )
2025-02-18 08:02:49 -08:00
Russell Bryant
d46d490c27
[Frontend] Move CLI code into vllm.cmd package ( #12971 )
2025-02-12 23:12:21 -08:00
Cody Yu
60c68df6d1
[Build] Automatically use the wheel of the base commit with Python-only build ( #13178 )
2025-02-12 23:10:28 -08:00
Kevin H. Luu
91e876750e
[misc] Fix setup.py condition to avoid AMD from being mistaken with CPU ( #13022 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2025-02-10 18:06:16 -08:00
Liangfu Chen
c45d398e6f
[CI] Resolve transformers-neuronx version conflict ( #12925 )
2025-02-08 01:41:35 -08:00
wangxiyuan
407b5537db
[Build] Make pypi install work on CPU platform ( #12874 )
2025-02-08 01:15:15 -08:00
Sophie du Couédic
649550f27e
[Build] update requirements of no-device for plugin usage ( #12630 )
...
Signed-off-by: Sophie du Couédic <sop@zurich.ibm.com>
2025-02-04 21:19:12 +08:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help
ensure
we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com>
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com>
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
Robert Shaw
9b0c4bab36
[Kernel] Triton Configs for Fp8 Block Quantization ( #11589 )
...
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
2025-01-30 11:53:22 -08:00
Harry Mellor
823ab79633
Update pre-commit hooks ( #12475 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-27 17:23:08 -07:00
Lucas Wilkinson
ab5bbf5ae3
[Bugfix][Kernel] Fix CUDA 11.8 being broken by FA3 build ( #12375 )
...
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-01-24 15:27:59 +00:00
Lucas Wilkinson
978b45f399
[Kernel] Flash Attention 3 Support ( #12093 )
...
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-01-23 06:45:48 -08:00
youkaichao
68ad4e3a8d
[Core] Support fully transparent sleep mode ( #11743 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-22 14:39:32 +08:00
Mengqing Cao
4004f144f3
[Build] update requirements of no-device ( #12299 )
...
Signed-off-by: Mengqing Cao <cmq0113@163.com>
2025-01-22 14:29:31 +08:00
Yuan Tang
b8bfa46a18
[Bugfix] Fix issues in CPU build Dockerfile ( #12135 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-01-17 12:54:01 +08:00
Konrad Zawora
1a51b9f872
[HPU][Bugfix] Don't use /dev/accel/accel0 for HPU autodetection in setup.py ( #12046 )
...
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
2025-01-15 02:59:18 +00:00
Wallas Henrique
cfd3219f58
[Hardware][Apple] Native support for macOS Apple Silicon ( #11696 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-01-08 16:35:49 +08:00
youkaichao
ad9f1aa679
[doc] update wheels url ( #11830 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 14:36:49 +08:00