vllm/block at a360ff80bb34f9dfcd21cf880c2030daa2d6b3a3 - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-18 19:47:22 +08:00

afeldman-nm 4238bc82f2

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

..

[Core] Sliding window for block manager v2 (#4545)

2024-05-28 11:07:07 +09:00

__init__.py

[Core][Bugfix]Refactor block manager for better testability (#3492)

2024-03-27 23:59:28 -07:00

conftest.py

[Misc] [CI/Build] Speed up block manager CPU-only unit tests ~10x by opting-out of GPU cleanup (#3783)

2024-04-02 00:49:51 +00:00

test_block_manager_v2.py

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

test_block_table.py

[Core][Optimization] change copy-on-write from dict[int, list] to list (#4648)

2024-05-07 11:06:32 -07:00

test_common.py

[Core][Bugfix]Refactor block manager for better testability (#3492)

2024-03-27 23:59:28 -07:00

test_cpu_gpu_block_allocator.py

[Core][Bugfix]Refactor block manager for better testability (#3492)

2024-03-27 23:59:28 -07:00

test_naive_block.py

[Core][Bugfix]Refactor block manager for better testability (#3492)

2024-03-27 23:59:28 -07:00

test_prefix_caching_block.py

[Core][Bugfix]: fix prefix caching for blockv2 (#4764)

2024-05-24 10:07:09 -07:00