xinyun/vllm
mirror of https://git.datalinker.icu/vllm-project/vllm.git
vllm/vllm/attention/ops

Latest commit: 3ddbe25502 by wangshuai09, [Hardware][CPU] using current_platform.is_cpu (#9536), 2024-10-22 00:50:43 -07:00
blocksparse_attention/      [Hardware][CPU] using current_platform.is_cpu (#9536)                                           2024-10-22 00:50:43 -07:00
__init__.py                 [Core] Refactor Attention Take 2 (#3462)                                                        2024-03-25 04:39:33 +00:00
ipex_attn.py                [Core/Bugfix] Add FP8 K/V Scale and dtype conversion for prefix/prefill Triton Kernel (#7208)   2024-08-12 22:47:41 +00:00
paged_attn.py               [Core/Bugfix] Add FP8 K/V Scale and dtype conversion for prefix/prefill Triton Kernel (#7208)   2024-08-12 22:47:41 +00:00
prefix_prefill.py           [CI/Build] Avoid CUDA initialization (#8534)                                                    2024-09-18 10:38:11 +00:00
triton_flash_attention.py   [ROCm][AMD][Bugfix] adding a missing triton autotune config (#4845)                             2024-05-16 10:46:52 -07:00
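The latest commit here (#9536) routes CPU detection through vllm's process-wide current_platform object rather than ad-hoc checks. A minimal sketch of that dispatch pattern, assuming the vllm.platforms interface around this commit; the selector function and the module names it returns are hypothetical, for illustration only:

```python
# Minimal sketch of the platform-dispatch pattern referenced by #9536,
# assuming vllm's `vllm.platforms.current_platform` interface at this commit.
from vllm.platforms import current_platform

def pick_attention_ops_module() -> str:
    """Hypothetical selector: illustrates how ops code can branch on the
    detected platform instead of scattered per-file CPU checks."""
    if current_platform.is_cpu():
        return "vllm.attention.ops.ipex_attn"   # CPU path (ipex_attn.py above)
    return "vllm.attention.ops.paged_attn"      # GPU path (paged_attn.py above)

if __name__ == "__main__":
    print(pick_attention_ops_module())
```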