vllm/attention at c08e2b30862df5427843de76d8a619ea566600c1 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-04-12 04:27:02 +08:00

Antoni Baum 999ef0b917

[Misc] Add numpy implementation of compute_slot_mapping (#7377)

2024-08-09 22:52:29 +00:00

[Misc] Add numpy implementation of compute_slot_mapping (#7377)

2024-08-09 22:52:29 +00:00

[Bugfix] Allow vllm to still work if triton is not installed (#6786)

2024-07-29 14:51:27 -07:00

__init__.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00

layer.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00

selector.py

[Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942)

2024-08-06 16:51:47 -04:00