xinyun/vllm - vllm - 丝路新云 Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-10 04:15:01 +08:00

Author	SHA1	Message	Date
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
qizixi	a2a5f79e09	Optimize triton unified attention performance for sliding window attention (#24390) Signed-off-by: zixi-qi <qizixi@meta.com>	2025-09-19 13:07:26 -06:00
jvlunteren	01a583fea4	[Kernel] Decouple Tile Size from Block Size in Triton Unified Attention Kernel (#21197) Signed-off-by: Jan van Lunteren <jvl@zurich.ibm.com>	2025-09-18 14:27:01 +00:00
Michael Goin	0fe85087a9	[CI Perf] Prune tests in `tests/kernels/attention/` (#22936) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-08-14 21:34:53 -06:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
Hongxia Yang	269d901734	[Bugfix][ROCm] fix the power of 2 exception from triton_unified_attention.py when running llama4 models and unit test fix (#18100) Signed-off-by: Hongxia Yang <hongxia.yang@amd.com> Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>	2025-05-29 07:21:46 +08:00
Thomas Parnell	e6b8e65d2d	[Bugfix] Fix fp8 tests for triton_unified_attention for Triton 3.3 (#18013) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>	2025-05-15 13:26:34 +08:00