vllm/vllm/engine

Latest commit: b1f3e18958 by Cody Yu, [MISC] Keep chunked prefill enabled by default with long context when prefix caching is enabled (#8342), 2024-09-10 22:28:28 +00:00

Name                  Last commit message                                                                                       Last commit date
output_processor/     [Core] Optimize Async + Multi-step (#8050)                                                                2024-09-03 18:50:29 +00:00
__init__.py           Change the name to vLLM (#150)                                                                            2023-06-17 03:07:40 -07:00
arg_utils.py          [MISC] Keep chunked prefill enabled by default with long context when prefix caching is enabled (#8342)  2024-09-10 22:28:28 +00:00
async_llm_engine.py   [Bugfix] Fix async postprocessor in case of preemption (#8267)                                            2024-09-07 21:01:51 -07:00
async_timeout.py      [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654)                                                    2024-06-19 13:57:12 -07:00
llm_engine.py         [Bugfix] Fix async postprocessor in case of preemption (#8267)                                            2024-09-07 21:01:51 -07:00
metrics_types.py      [MISC] Add prefix cache hit rate to metrics (#7606)                                                       2024-08-19 11:52:07 -07:00
metrics.py            [MISC] Add prefix cache hit rate to metrics (#7606)                                                       2024-08-19 11:52:07 -07:00
protocol.py           [Core] Logprobs support in Multi-step (#7652)                                                             2024-08-29 19:19:08 -07:00
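The files above make up vLLM's core engine module: arg_utils.py defines EngineArgs, llm_engine.py and async_llm_engine.py hold the synchronous and asynchronous engines, metrics.py and metrics_types.py back the logged stats, and output_processor/ turns model steps into RequestOutput objects. As a rough illustration of how these pieces fit together, here is a minimal sketch, assuming the public Python API of vLLM around this commit era (roughly v0.6.x, September 2024); the model id and prompt are placeholders.

# Minimal sketch, not the authoritative API: assumes vLLM ~0.6.x, where
# EngineArgs (arg_utils.py) configures LLMEngine (llm_engine.py).
from vllm import EngineArgs, LLMEngine, SamplingParams

# Placeholder model; any Hugging Face model id supported by vLLM works.
engine_args = EngineArgs(model="facebook/opt-125m")
engine = LLMEngine.from_engine_args(engine_args)

sampling_params = SamplingParams(temperature=0.8, max_tokens=32)
engine.add_request("request-0", "Hello, my name is", sampling_params)

# step() runs one scheduling/model iteration and returns RequestOutput
# objects assembled by the output_processor/ code.
while engine.has_unfinished_requests():
    for output in engine.step():
        if output.finished:
            print(output.outputs[0].text)

The higher-level vllm.LLM class wraps this same loop; driving LLMEngine directly, as sketched here, is what the async engine in async_llm_engine.py does under the hood for serving.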