vllm/attention at 68d37809b9b52f4d012fa0dfbb187f0fe978bdbc - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-03-26 10:13:35 +08:00

Mengqing Cao 5c7963249d

[attn][tiny fix] fix attn backend in MultiHeadAttention (#11463)

Signed-off-by: Mengqing Cao <cmq0113@163.com>

2024-12-24 12:39:36 +00:00

[Docs] Convert rST to MyST (Markdown) (#11145)

2024-12-23 22:35:38 +00:00

[Bugfix] Fix chunked prefill with model dtype float32 on Turing Devices (#9850)

2024-11-25 12:23:32 -05:00

__init__.py

[Core] Add AttentionState abstraction (#7663)

2024-08-20 18:50:45 +00:00

layer.py

[attn][tiny fix] fix attn backend in MultiHeadAttention (#11463)

2024-12-24 12:39:36 +00:00

selector.py

[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)

2024-11-19 11:22:26 +08:00