vllm/core at 533c2177925ba19934eab0095a50d0a783185e6b - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-28 05:07:11 +08:00

afeldman-nm 4238bc82f2

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

block_manager_v1.py

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

block_manager_v2.py

[Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

2024-05-29 16:09:13 +00:00

embedding_model_block_manager.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

evictor_v1.py

[Core] Enable prefix caching with block manager v2 enabled (#4142)

2024-05-01 11:20:32 -07:00

evictor_v2.py

[mypy][6/N] Fix all the core subdirectory typing (#4450)

2024-05-02 03:01:00 +00:00

interfaces.py

[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)

2024-05-11 11:30:37 -07:00

policy.py

[Chunked Prefill][4/n] Chunked prefill scheduler. (#3853)

2024-04-05 10:17:58 -07:00

scheduler.py

[Core] Fix scheduler considering "no LoRA" as "LoRA" (#4897)

2024-05-20 17:48:32 -07:00