xinyun/vllm - vllm - 丝路新云 (Silk Road New Cloud) Code Repository

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-10 02:55:40 +08:00

Author	SHA1	Message	Date
Roger Wang	0ff70821c9	[Core] Deprecate `xformers` (#29262) Signed-off-by: Roger Wang <hey@rogerw.io>	2025-11-24 04:18:55 +00:00
rasmith	5e5a7eb16f	[CI/Build] Make test_attention_selector.py run tests on correct platform (#29064) Signed-off-by: Randall Smith <ransmith@amd.com> Signed-off-by: rasmith <Randall.Smith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-20 20:45:56 +00:00
Li, Jiang	7f829be7d3	[CPU] Refactor CPU attention backend (#27954) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-11-12 09:43:06 +08:00
Matthew Bonanni	b30dfa03c5	[Attention] Refactor CUDA attention backend selection logic (#24794) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>	2025-11-11 07:40:44 -05:00
Pleaplusone	6cae1e5332	[ROCm][MLA] Support block-size > 1 for AITER MLA backend (#27224) Signed-off-by: ganyi <ygan@amd.com> Co-authored-by: wuhuikx <hattie.wu@amd.com>	2025-11-05 10:43:02 -05:00
Wenzheng Bi	ec10fd0abc	[Bugfix] Move current_platform import to avoid python import cache. (#16601) Signed-off-by: iwzbi <wzbi@zju.edu.cn>	2025-10-09 10:46:19 +00:00
Matthew Bonanni	76879cc160	[Attention] Implement universal BACKEND_MAP (#25900) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>	2025-10-08 12:00:25 -07:00
Lucas Wilkinson	f80e7866c0	[Misc] Clean up cruft from previous FlashMLA sparse implementation (#26125) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>	2025-10-08 10:09:34 +08:00
Cyrus Leung	1e4ecca1d0	[V0 Deprecation] Remove `VLLM_USE_V1` from tests (#26341) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-10-07 15:42:31 +00:00
Harry Mellor	6c04638214	Fix per file ruff ignores related to line length (#26262) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-06 05:12:40 +00:00
Harry Mellor	d6953beb91	Convert formatting to use `ruff` instead of `yapf` + `isort` (#26247) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-10-05 07:06:22 -07:00
Matthew Bonanni	3468f17ebe	[V0 deprecation] Remove _VLLM_V1 suffixes from attention backend names (#25489) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>	2025-09-25 17:37:50 +00:00
Thomas Parnell	969b4da3a6	[V0 Deprecation] Remove placeholder attn (#25510) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>	2025-09-23 22:12:14 +00:00
Isotr0py	b6a136b58c	[CI/Build] Fix disabled v1 attention backend selection test (#25471) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-09-23 13:05:46 +00:00
Woosuk Kwon	bc6e542d9f	Remove V0 attention backends (#25351) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-21 16:03:28 -07:00
Woosuk Kwon	52c2a8d4ad	[V0 Deprecation] Remove LLMEngine (#25033) Signed-off-by: Woosuk Kwon <woosuk@thinkingmachines.ai> Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-09-20 17:56:30 -07:00
Michael Goin	087c6ffc92	[CI Bugfix] Fix failing test_invalid_env (#25078) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-09-17 08:28:58 -07:00
Matthew Bonanni	5fe643fc26	Add FLASHINFER_MLA to backend selector test (#24753) Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>	2025-09-12 22:30:07 +00:00
Lucas Wilkinson	402759d472	[Attention] FlashAttn MLA (#14258) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>	2025-09-04 02:47:59 -07:00
Woosuk Kwon	14006840ea	[V0 Deprecation] Remove V0 FlashInfer attention backend (#22776) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-08-18 19:54:16 -07:00
Michael Goin	e79a12fc3a	[UX] Fail if an invalid attention backend is specified (#22217) Signed-off-by: mgoin <michael@neuralmagic.com>	2025-08-04 23:54:52 -07:00
Cyrus Leung	9fb52e523a	[V1] Support any head size for FlexAttention backend (#20467) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-07-06 09:54:36 -07:00
Woosuk Kwon	e202dd2736	[V0 deprecation] Remove V0 CPU/XPU/TPU backends (#20412) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by: jiang1.li <jiang1.li@intel.com> Co-authored-by: Li, Jiang <jiang1.li@intel.com>	2025-07-06 08:48:13 -07:00
Isotr0py	32c9be2200	[v1] Re-add fp32 support to v1 engine through FlexAttention (#19754) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-07-05 09:41:10 +00:00
TY-AMD	96453cfa83	[BugFix][V1][ROCm] Triton MLA uses V0 backend on V1 engine (#19067) Signed-off-by: Tianyuan Wu <Tianyuan.Wu@amd.com>	2025-07-01 16:12:19 +08:00
Isotr0py	5f1ac1e1d1	Revert "[v1] Add fp32 support to v1 engine through flex attn" (#19404)	2025-06-10 01:30:20 -07:00
Isotr0py	b8089195b4	[v1] Add fp32 support to v1 engine through flex attn (#19319) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>	2025-06-09 22:10:44 +08:00
Li, Jiang	4555143ea7	[CPU] V1 support for the CPU backend (#16441)	2025-06-03 18:43:01 -07:00
Simon Mo	02f0c7b220	[Misc] Add SPDX-FileCopyrightText (#19100) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-06-03 11:20:17 -07:00
tracelogfb	246e3e0a36	fix broken test vllm:test_kernels - test_attention_selector.py::test_flash_attn (#17873) Co-authored-by: Stephen Chen <tracelog@meta.com>	2025-05-10 10:46:54 +08:00
vllmellm	3c9396a64f	[FEAT][ROCm]: Support AITER MLA on V1 Engine (#17523) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com> Co-authored-by: qli88 <qiang.li2@amd.com> Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>	2025-05-09 10:42:05 +08:00
Michael Goin	6317a5174a	Categorize `tests/kernels/` based on kernel type (#16799) Signed-off-by: mgoin <mgoin64@gmail.com>	2025-04-23 09:21:07 -04:00

32 Commits