vllm/attention at dac6a3f6ed14ea4061b672f9290bfdf8bcdd996d - vllm - 丝路新云 Code Repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-03-27 02:43:44 +08:00

Woosuk Kwon 89579a201f

[Misc] Use vllm-flash-attn instead of flash-attn (#4686)

2024-05-08 13:15:34 -07:00

..

[Misc] Use vllm-flash-attn instead of flash-attn (#4686)

2024-05-08 13:15:34 -07:00

[Core][Optimization] change python dict to pytorch tensor for blocks to swap (#4659)

2024-05-08 12:07:05 -07:00

__init__.py

[Core][5/N] Fully working chunked prefill e2e (#3884)

2024-04-10 17:56:48 -07:00

layer.py

[Misc]Add customized information for models (#4132)

2024-04-30 21:18:14 -07:00

selector.py

[Misc] Use vllm-flash-attn instead of flash-attn (#4686)

2024-05-08 13:15:34 -07:00