vllm/attention at 709c9f1f257fd15545ad19b89ed5019cb5ea338b - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-27 07:57:12 +08:00

Mengqing Cao 8c1fb50705

[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)

Signed-off-by: Mengqing Cao <cmq0113@163.com>

2024-11-19 11:22:26 +08:00

[Hardware][CPU] Add embedding models support for CPU backend (#10193)

2024-11-11 08:54:28 +00:00

[Kernel] Explicitly specify other value in tl.load calls (#9014)

2024-11-18 11:39:40 -08:00

__init__.py

[Core] Add AttentionState abstraction (#7663)

2024-08-20 18:50:45 +00:00

layer.py

[Kernel] Support sliding window in flash attention backend (#9403)

2024-10-20 10:57:52 -07:00

selector.py

[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)

2024-11-19 11:22:26 +08:00