Michael Goin
f6818a92cb
[UX] Move Dockerfile DeepGEMM install to tools/install_deepgemm.sh ( #23360 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-22 20:52:50 -06:00
Zhewen Li
0483fabc74
[CI/Build] add EP dependencies to docker ( #21976 )
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-08-22 13:34:40 -07:00
Michael Goin
50df09fe13
Update to flashinfer-python==0.2.12 and disable AOT compile for non-release image ( #23129 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-20 08:05:54 -04:00
Eli Uriegas
76144adf76
ci: Add CUDA + arm64 release builds ( #21201 )
Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
2025-08-15 23:16:23 +00:00
Harry Mellor
e8b40c7fa2
[CI] Remove duplicated docs build from buildkite ( #22924 )
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-15 05:58:06 -07:00
Frank Wang
ba81acbdc1
[Bugfix] Bump DeepGEMM Version to Fix SMXX Layout Issues ( #22606 )
Signed-off-by: frankwang28 <frank.wbb@hotmail.com>
2025-08-12 15:43:06 -07:00
Po-Han Huang (NVIDIA)
dc5e4a653c
Upgrade FlashInfer to v0.2.11 ( #22613 )
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-08-11 19:58:41 -07:00
Doug Smith
d1af8b7be9
enable Docker-aware precompiled wheel setup ( #22106 )
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-08-10 16:29:02 -07:00
Michael Goin
e8961e963a
Update flashinfer-python==0.2.10 ( #22389 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-06 18:10:24 -07:00
Michael Goin
a7cb6101ca
[CI/Build] Update flashinfer to 0.2.9 ( #22233 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-05 09:39:38 -07:00
Michael Goin
c494f96fbc
Use UV_LINK_MODE=copy in Dockerfile to avoid hardlink fail ( #22128 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-08-05 06:57:10 -07:00
Simon Mo
da31f6ad3d
Revert precompile wheel changes ( #22055 )
2025-08-01 08:26:24 +00:00
Matthew Bonanni
e360316ab9
Add DeepGEMM to Dockerfile in vllm-base image ( #21533 )
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-07-31 18:01:55 -07:00
Doug Smith
58bb902186
fix(setup): improve precompiled wheel setup for Docker builds ( #22025 )
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-31 09:52:48 -07:00
Daniele
d2aab336ad
[CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES ( #21599 )
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2025-07-31 15:00:08 +08:00
Doug Smith
a1873db23d
docker: docker-aware precompiled wheel support ( #21127 )
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-29 14:45:19 -07:00
Michael Goin
a33ea28b1b
Add flashinfer_python to CUDA wheel requirements ( #21389 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-29 12:51:58 -07:00
weiliang
01c753ed98
update flashinfer to v0.2.9rc2 ( #21701 )
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
2025-07-28 19:31:47 +00:00
weiliang
2dd72d23d9
update flashinfer to v0.2.9rc1 ( #21485 )
Signed-off-by: Weiliang Liu <weiliangl@nvidia.com>
2025-07-24 14:06:11 -07:00
elvischenv
5a19a6c670
[Fix] Update mamba_ssm to 2.2.5 ( #21421 )
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2025-07-24 03:25:41 -07:00
cjackal
526078a96c
bump flashinfer to v0.2.8 ( #21385 )
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2025-07-24 03:20:38 -07:00
Matthew Bonanni
aa08a954f9
[Bugfix] Fix casing warning ( #21468 )
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
2025-07-23 20:41:23 -07:00
Kay Yan
8188196a1c
[CI] Cleanup modelscope version constraint in Dockerfile ( #21243 )
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-07-20 20:13:02 -07:00
Michael Goin
a50d918225
[Docker] Allow FlashInfer to be built in the ARM CUDA Dockerfile ( #21013 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-16 19:37:13 -07:00
Peter Pan
1eb2b9c102
[CI] update typos config for CI pre-commit and fix some spells ( #20919 )
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-15 21:12:40 -07:00
Doug Smith
7976446015
Add Dockerfile argument for VLLM_USE_PRECOMPILED environment ( #20943 )
Signed-off-by: dougbtv <dosmith@redhat.com>
2025-07-15 19:53:57 -07:00
Michael Goin
cf75cd2098
[CI Bugfix] Specify same TORCH_CUDA_ARCH_LIST for flashinfer aot and install ( #20772 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-11 01:16:01 +00:00
Michael Goin
4b9a9435bb
Update Dockerfile FlashInfer to v0.2.8rc1 ( #20718 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-10 08:09:02 -07:00
Michael Goin
b7d9e9416f
[CI/Build] Fix FlashInfer double build in Dockerfile ( #20651 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-07-09 17:41:56 -06:00
Peter Pan
5561681d04
[CI] add kvcache-connector dependency definition and add into CI build ( #18193 )
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-07-04 06:49:18 -07:00
Jee Jee Li
1819fbda63
[Quantization] Bump to use latest bitsandbytes ( #20424 )
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-07-03 21:58:46 +08:00
Tyler Michael Smith
bdb84e26b0
[Bugfix] Fixes for FlashInfer's TORCH_CUDA_ARCH_LIST ( #20136 )
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
2025-07-02 17:15:11 -07:00
Fabien Dupont
3c545c0c3b
[CI/Build] Allow hermetic builds ( #18064 )
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Fabien Dupont <fabiendupont@pm.me>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Elias Levy <eliaslevy@google.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-06-27 09:04:39 -07:00
Michael Goin
296ce95d8e
[CI] Add SM120 to the Dockerfile ( #19794 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-25 16:23:56 -07:00
Michael Goin
497a91e9f7
[CI] Update FlashInfer to 0.2.6.post1 ( #19297 )
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-06-11 22:57:28 +08:00
Cyrus Leung
7d9216495c
[Doc] Update references to doc files ( #18637 )
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-05-23 15:49:21 -07:00
Huy Do
1645b60196
Use prebuilt FlashInfer x86_64 PyTorch 2.7 CUDA 12.8 wheel for CI ( #18537 )
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-05-23 21:17:16 +00:00
Harry Mellor
a1fe24d961
Migrate docs from Sphinx to MkDocs ( #18145 )
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-05-23 02:09:53 -07:00
Tyler Michael Smith
6e588da0f4
[Build/CI] Fix CUDA 11.8 build ( #17679 )
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
2025-05-22 12:13:54 -07:00
Kebe
371376f996
[Build] fix Dockerfile shell ( #18402 )
2025-05-21 07:32:06 -07:00
Simon Mo
47fda6d089
[Build] Supports CUDA 12.6 and 11.8 after Blackwell Update ( #18316 )
Signed-off-by: simon-mo <simon.mo@hey.com>
2025-05-18 23:19:33 -07:00
Michael Goin
dcfe95234c
Update Dockerfile to build for Blackwell ( #18095 )
2025-05-17 00:23:25 -07:00
Bowen Wang
7fdfa01530
[Sampler] Adapt to FlashInfer 0.2.3 sampler API ( #15777 )
Signed-off-by: Bowen Wang <abmfy@icloud.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
2025-05-16 15:14:03 -07:00
Huy Do
2c4f59afc3
Update PyTorch to 2.7.0 ( #16859 )
2025-04-29 19:08:04 -07:00
Reid
08e15defa9
[CI/Build] Add retry mechanism for add-apt-repository ( #17107 )
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-04-29 10:40:52 -07:00
Lennart K. M. Schulz
d1aeea7553
[Bugfix] Fix missing ARG in Dockerfile for arm64 platforms ( #17261 )
Signed-off-by: lkm-schulz <44176356+lkm-schulz@users.noreply.github.com>
2025-04-27 19:38:14 -07:00
Sangyeon Cho
b07d741661
[CI/Build] workaround for CI build failure ( #17070 )
Signed-off-by: csy1204 <josang1204@gmail.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-04-23 16:14:18 -07:00
rongfu.leng
7bdfd29a35
[Misc] add collect_env to cli and docker image ( #16759 )
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-04-17 22:13:35 -07:00
rongfu.leng
96bb8aa68b
[Bugfix] fix gpu docker image mis benchmarks dir ( #16628 )
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
2025-04-15 21:21:14 -07:00
Harry Mellor
e6e3c55ef2
Move dockerfiles into their own directory ( #14549 )
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 13:47:32 -07:00