xinyun/vllm - vllm - 丝路新云 code repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-09 04:04:57 +08:00

Author	SHA1	Message	Date
Ralf Gommers	7c1ed45848	[CI/Build]: make it possible to build with a free-threaded interpreter (#29241) Signed-off-by: Ralf Gommers <ralf.gommers@gmail.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>	2025-11-28 15:21:46 -08:00
Abolfazl Shahbazi	d15afc1fd0	Refactor CPU/GPU extension targets for CMake build (#28026) Signed-off-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>	2025-11-08 14:17:35 +08:00
Fadi Arafeh	a663f6ae64	[cpu][perf] Fix low CPU utilization with VLLM_CPU_OMP_THREADS_BIND on AArch64 (#27415) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>	2025-10-27 11:14:55 +00:00
Johnny	5234dc7451	[NVIDIA] Blackwell Family (#24673) Signed-off-by: Johnny <johnnynuca14@gmail.com> Signed-off-by: johnnynunez <johnnynuca14@gmail.com> Signed-off-by: Johnny <johnnync13@gmail.com> Signed-off-by: Salvatore Cena <cena@cenas.it> Co-authored-by: Aidyn-A <31858918+Aidyn-A@users.noreply.github.com> Co-authored-by: Salvatore Cena <cena@cenas.it>	2025-10-01 10:50:54 -07:00
FengjinChen	79cbcab871	Force use C++17 globally to avoid compilation error (#24823) Signed-off-by: chenfengjin <1871653365@qq.com>	2025-09-14 19:30:10 +00:00
Gregory Shtrasberg	5d5d419ca6	[Bugfix][CI/Build][ROCm] Make sure to use the headers from the build folder on ROCm (#22264) Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>	2025-08-05 20:39:32 -07:00
Huy Do	6c9837a761	Fix cuda_archs_loose_intersection when handling sm_*a (#20207) Signed-off-by: Huy Do <huydhn@gmail.com>	2025-06-29 16:52:34 -07:00
Lu Fang	8d1e89d946	[Misc][ROCm] Enforce no unused variable in ROCm C++ files (#19796) Signed-off-by: Lu Fang <lufang@fb.com>	2025-06-18 20:25:15 -07:00
Luka Govedič	a3896c7f02	[Build] Fixes for CMake install (#18570)	2025-05-27 20:49:24 -04:00
Lucas Wilkinson	c7852a6d9b	[Build] Allow shipping PTX on a per-file basis (#18155) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-05-15 16:41:55 -07:00
Kaixi Hou	4fc5c23bb6	[NVIDIA] Support nvfp4 quantization (#12784)	2025-02-12 19:51:51 -08:00
Lucas Wilkinson	103bd17ac5	[Build] Only build 9.0a for scaled_mm and sparse kernels (#12339) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-01-27 10:40:00 -05:00
tvirolai-amd	cd9d06fb8d	Allow hip sources to be directly included when compiling for rocm. (#12087)	2025-01-15 16:46:03 -05:00
bnellnm	3cb07a36a2	[Misc] Upgrade to pytorch 2.5 (#9588) Signed-off-by: Bill Nell <bill@neuralmagic.com> Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: youkaichao <youkaichao@gmail.com>	2024-10-27 09:44:24 +00:00
Lucas Wilkinson	aeb37c2a72	[CI/Build] Per file CUDA Archs (improve wheel size and dev build times) (#8845)	2024-10-03 22:55:25 -04:00
Luka Govedič	71c60491f2	[Kernel] Build flash-attn from source (#8245)	2024-09-20 23:27:10 -07:00
bnellnm	de6f90a13d	[Misc] guard against change in cuda library name (#8609)	2024-09-20 06:36:30 +08:00
bnellnm	73202dbe77	[Kernel][Misc] register ops to prevent graph breaks (#6917) Co-authored-by: Sage Moore <sage@neuralmagic.com>	2024-09-11 12:52:19 -07:00
Jee Jee Li	f80ab3521c	Clean up remaining Punica C information (#7027)	2024-08-04 15:37:08 -07:00
Matt Wong	dd793d1de5	[Hardware][AMD][CI/Build][Doc] Upgrade to ROCm 6.1, Dockerfile improvements, test fixes (#5422)	2024-06-25 15:56:15 -07:00
Hongxia Yang	f758aed0e8	[Bugfix][CI/Build][AMD][ROCm]Fixed the cmake build bug which generate garbage on certain devices (#5641)	2024-06-18 23:21:29 -07:00
bnellnm	5467ac3196	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
Cody Yu	c833101740	[Kernel] Refactor FP8 kv-cache with NVIDIA float8_e4m3 support (#4535)	2024-05-09 18:04:17 -06:00
Matt Wong	59a6abf3c9	[Hotfix][CI/Build][Kernel] CUDA 11.8 does not support layernorm optimizations (#3782)	2024-04-08 14:31:02 -07:00
Adrian Abeyta	2ff767b513	Enable scaled FP8 (e4m3fn) KV cache on ROCm (AMD GPU) (#3290) Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com> Co-authored-by: HaiShaw <hixiao@gmail.com> Co-authored-by: AdrianAbeyta <Adrian.Abeyta@amd.com> Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com> Co-authored-by: root <root@gt-pla-u18-08.pla.dcgpu> Co-authored-by: mawong-amd <156021403+mawong-amd@users.noreply.github.com> Co-authored-by: ttbachyinsda <ttbachyinsda@outlook.com> Co-authored-by: guofangze <guofangze@kuaishou.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: jacobthebanana <50071502+jacobthebanana@users.noreply.github.com> Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-04-03 14:15:55 -07:00
mawong-amd	b6d103542c	[Kernel] Layernorm performance optimization (#3662)	2024-03-30 14:26:38 -07:00
Simon Mo	51c31bc10c	CMake build elf without PTX (#3739)	2024-03-30 01:53:08 +00:00
bnellnm	3ad438c66f	Fix build when nvtools is missing (#3698)	2024-03-29 18:52:39 -07:00
bnellnm	9fdf3de346	Cmake based build system (#2830)	2024-03-18 15:38:33 -07:00

29 Commits