xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git
vllm/vllm/attention/ops
Latest commit: 67b4221a61 [Core][5/N] Fully working chunked prefill e2e (#3884) by SangBin Cho, 2024-04-10 17:56:48 -07:00
File                         Last commit message                                               Date
__init__.py                  [Core] Refactor Attention Take 2 (#3462)                          2024-03-25 04:39:33 +00:00
paged_attn.py                [Core][5/N] Fully working chunked prefill e2e (#3884)             2024-04-10 17:56:48 -07:00
prefix_prefill.py            [Core] Refactor Attention Take 2 (#3462)                          2024-03-25 04:39:33 +00:00
triton_flash_attention.py    [Model][AMD] ROCm support for 256 head dims for Gemma (#3972)     2024-04-10 08:12:00 -07:00