xinyun/vllm - vllm - 丝路新云-代码仓

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-16 12:45:01 +08:00

Author	SHA1	Message	Date
Hosang	dd5fa7e04f	[ROCm][Kernel][V1] Enable AMD Radeon GPU Custom Paged Attention on v1 (#17004 ) Signed-off-by: Hosang Yoon <hosang.yoon@amd.com>	2025-05-21 08:35:00 -07:00
Thomas Parnell	e6b8e65d2d	[Bugfix] Fix fp8 tests for triton_unified_attention for Triton 3.3 (#18013 ) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-05-15 13:26:34 +08:00
tracelogfb	246e3e0a36	fix broken test vllm:test_kernels - test_attention_selector.py::test_flash_attn (#17873 ) Co-authored-by: Stephen Chen <tracelog@meta.com>	2025-05-10 10:46:54 +08:00
vllmellm	3c9396a64f	[FEAT][ROCm]: Support AITER MLA on V1 Engine (#17523 ) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com> Co-authored-by: qli88 <qiang.li2@amd.com> Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>	2025-05-09 10:42:05 +08:00
Mengqing Cao	f9bc5a0693	[Bugfix] Fix triton import with local TritonPlaceholder (#17446 ) Signed-off-by: Mengqing Cao <cmq0113@163.com>	2025-05-06 17:53:09 +08:00
Happy	9869453c42	Update test_flash_attn.py (#17102 ) Signed-off-by: ShuaibinLi <lishuaibin@live.cn>	2025-04-26 22:17:35 +00:00
Shu Wang	9e96f56efb	Allocate kv_cache with stride order (#16605 ) Signed-off-by: shuw <shuw@nvidia.com>	2025-04-25 22:03:31 -07:00
Michael Goin	6317a5174a	Categorize `tests/kernels/` based on kernel type (#16799 ) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-04-23 09:21:07 -04:00