vllm/vllm/v1 (mirror of https://git.datalinker.icu/vllm-project/vllm.git)
Latest commit: 42bb201fd6 by Woosuk Kwon, 2024-12-28 13:33:12 +00:00
[V1][Minor] Set pin_memory=False for token_ids_cpu tensor (#11581)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
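The head commit above changes how the token_ids_cpu buffer is allocated. As a purely illustrative sketch of what toggling pinned memory looks like in PyTorch (the tensor name comes from the commit title; the shape and dtype here are assumptions, not vLLM's actual values):

```python
import torch

# Sketch only: allocate a large CPU-side token buffer without pinned
# (page-locked) memory. Pinning speeds up async host-to-device copies but
# consumes non-pageable RAM, so it can be skipped for a buffer that is not
# copied to the GPU wholesale. Shape and dtype are assumed for illustration.
max_num_reqs, max_model_len = 256, 4096
token_ids_cpu = torch.zeros(
    (max_num_reqs, max_model_len),
    dtype=torch.int32,
    device="cpu",
    pin_memory=False,
)
```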
Name | Last commit | Last updated
attention/ | Enable mypy checking on V1 code (#11105) | 2024-12-14 09:54:04 -08:00
core/ | [V1] Simplify prefix caching logic by removing num_evictable_computed_blocks (#11310) | 2024-12-19 04:17:12 +00:00
engine/ | [V1] [4/N] API Server: ZMQ/MP Utilities (#11541) | 2024-12-28 01:45:08 +00:00
executor/ | [V1] [4/N] API Server: ZMQ/MP Utilities (#11541) | 2024-12-28 01:45:08 +00:00
sample/ | [V1] Fix yapf (#11538) | 2024-12-27 09:47:10 +09:00
worker/ | [V1][Minor] Set pin_memory=False for token_ids_cpu tensor (#11581) | 2024-12-28 13:33:12 +00:00
__init__.py | [V1] AsyncLLM Implementation (#9826) | 2024-11-11 23:05:38 +00:00
outputs.py | [V1] Multiprocessing Tensor Parallel Support for v1 (#9856) | 2024-12-10 06:28:14 +00:00
request.py | [V1] Prefix caching for vision language models (#11187) | 2024-12-17 16:37:59 -08:00
serial_utils.py | [V1] Use pickle for serializing EngineCoreRequest & Add multimodal inputs to EngineCoreRequest (#10245) | 2024-11-12 08:57:14 -08:00
utils.py | [V1] [4/N] API Server: ZMQ/MP Utilities (#11541) | 2024-12-28 01:45:08 +00:00
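The serial_utils.py entry above refers to switching EngineCoreRequest serialization to pickle. A minimal sketch of such a pickle round trip, using a hypothetical stand-in class (the real EngineCoreRequest has more fields, and vLLM wraps the calls in its own serialization helpers):

```python
import pickle
from dataclasses import dataclass, field

# Hypothetical stand-in for EngineCoreRequest; vLLM's actual class carries
# more state (e.g. sampling params and multimodal inputs, per the commit
# title). Field names here are assumptions for illustration only.
@dataclass
class EngineCoreRequest:
    request_id: str
    prompt_token_ids: list[int]
    mm_inputs: list[dict] = field(default_factory=list)

# Serialize on the sending side, deserialize on the receiving side: pickle
# handles arbitrary Python objects, unlike a pure-JSON wire format.
req = EngineCoreRequest(request_id="req-0", prompt_token_ids=[1, 2, 3])
payload = pickle.dumps(req)
restored = pickle.loads(payload)
assert restored == req
```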