vllm/attention at 6d7f037748b2e7df64f3318e54101a1c80016f3c - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-22 16:47:07 +08:00

Luka Govedič e1744502c2

[FP8] Refactor apply_fp8_linear and apply_fp8_linear_generic into an object (#14390)

Signed-off-by: luka <luka@neuralmagic.com>

2025-03-07 05:20:16 +00:00

[FP8] Refactor apply_fp8_linear and apply_fp8_linear_generic into an object (#14390)

2025-03-07 05:20:16 +00:00

Add authors to license header. (#14371)

2025-03-06 08:43:09 -08:00

__init__.py

[Attention] MLA with chunked prefill (#12639)

2025-02-21 15:30:12 -08:00

layer.py

[Bug] Fix Attention when ignored in by quant_method (#14313)

2025-03-06 14:18:06 -08:00

selector.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00