xinyun/vllm (mirror of https://git.datalinker.icu/vllm-project/vllm.git)
vllm/vllm/engine
Latest commit: 98cef6a227 by Cyrus Leung, "[Core] Increase default max_num_batched_tokens for multimodal models (#8028)", 2024-08-30 08:20:34 -07:00
Name                  Last commit                                                                   Last updated
output_processor/     [Core] Logprobs support in Multi-step (#7652)                                 2024-08-29 19:19:08 -07:00
__init__.py           Change the name to vLLM (#150)                                                2023-06-17 03:07:40 -07:00
arg_utils.py          [Core] Increase default max_num_batched_tokens for multimodal models (#8028)  2024-08-30 08:20:34 -07:00
async_llm_engine.py   [Core] Logprobs support in Multi-step (#7652)                                 2024-08-29 19:19:08 -07:00
async_timeout.py      [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654)                        2024-06-19 13:57:12 -07:00
llm_engine.py         [Core] Increase default max_num_batched_tokens for multimodal models (#8028)  2024-08-30 08:20:34 -07:00
metrics_types.py      [MISC] Add prefix cache hit rate to metrics (#7606)                           2024-08-19 11:52:07 -07:00
metrics.py            [MISC] Add prefix cache hit rate to metrics (#7606)                           2024-08-19 11:52:07 -07:00
protocol.py           [Core] Logprobs support in Multi-step (#7652)                                 2024-08-29 19:19:08 -07:00
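The files above make up vLLM's core engine package: arg_utils.py defines the EngineArgs configuration, llm_engine.py the synchronous LLMEngine, and async_llm_engine.py its asyncio counterpart. For orientation, here is a minimal sketch of driving the synchronous engine directly, following the add_request/step pattern used in the repository's own engine examples around this revision. It assumes vLLM is installed; the model name "facebook/opt-125m", the prompt, and the request id are illustrative choices, not anything prescribed by this directory.

    from vllm import EngineArgs, LLMEngine, SamplingParams

    # EngineArgs (arg_utils.py) holds the engine configuration; from_engine_args
    # builds an LLMEngine (llm_engine.py) from it.
    engine = LLMEngine.from_engine_args(EngineArgs(model="facebook/opt-125m"))

    # Queue one request, then pump the engine loop until everything finishes.
    # "request-0" is an arbitrary caller-chosen id.
    engine.add_request("request-0", "Hello, my name is",
                       SamplingParams(temperature=0.8, max_tokens=32))
    while engine.has_unfinished_requests():
        for output in engine.step():  # one scheduling + model step
            if output.finished:
                print(output.outputs[0].text)

async_llm_engine.py wraps the same loop for asyncio callers; a corresponding sketch under the same assumptions (the (#5654) fix to async_timeout.py listed above is what addressed hangs under plain asyncio.run):

    import asyncio

    from vllm import AsyncEngineArgs, AsyncLLMEngine, SamplingParams

    async def main() -> None:
        engine = AsyncLLMEngine.from_engine_args(
            AsyncEngineArgs(model="facebook/opt-125m"))
        # generate() is an async generator that streams incremental
        # RequestOutputs for a single request.
        async for output in engine.generate("Hello, my name is",
                                            SamplingParams(max_tokens=32),
                                            request_id="request-0"):
            if output.finished:
                print(output.outputs[0].text)

    asyncio.run(main())

The design split is that LLMEngine exposes the explicit add_request/step loop suited to offline batch processing, while AsyncLLMEngine runs that loop in the background and streams results per request.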