xinyun/vllm - vllm - 丝路新云 Code Repository

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-05-30 10:07:03 +08:00

Author	SHA1	Message	Date
Angela Yi	db74d60490	[Bugfix] Add fake mode around passes (#23349) Signed-off-by: angelayi <yiangela7@gmail.com>	2025-08-28 11:25:56 -04:00
elvischenv	24d0c9e6ed	[NVIDIA][torch.compile] Support Flashinfer TRTLLM FP8-q/kv NVFP4-out Attention Kernel (#22703) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-08-22 22:09:05 +00:00
elvischenv	03752dba8f	[NVIDIA] Support Flashinfer TRTLLM FP8-q/kv/out Attention Kernel (#21716) Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-08-19 08:22:15 -04:00
Richard Zou	8026a335a1	[BugFix] Update AttnFusionPass cache key (#21947) Signed-off-by: Richard Zou <zou3519@gmail.com>	2025-08-01 07:11:29 -07:00
Jee Jee Li	1caca5a589	[Misc] Add SPDX-FileCopyrightText (#20428) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-07-04 07:40:42 +00:00
Luka Govedič	f98548b9da	[torch.compile][ROCm] Fuse quantization onto attention using a torch.compile pass (#16756) Signed-off-by: Luka Govedič <lgovedic@redhat.com> Co-authored-by: Sage Moore <sage@neuralmagic.com>	2025-06-12 08:31:04 -07:00