xinyun/vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git, last synced 2026-01-18 18:14:33 +08:00
vllm/vllm/engine
Latest commit: 9b9cef3145 by Joe Runde, 2024-12-10 16:38:23 +00:00
    [Bugfix] Backport request id validation to v0 (#11036)
    Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
| Entry | Last commit | Date |
| --- | --- | --- |
| multiprocessing/ | [Bugfix] Backport request id validation to v0 (#11036) | 2024-12-10 16:38:23 +00:00 |
| output_processor/ | [Doc] Create a new "Usage" section (#10827) | 2024-12-05 11:19:35 +08:00 |
| __init__.py | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00 |
| arg_utils.py | [V1] Initial support of multimodal models for V1 re-arch (#10699) | 2024-12-08 12:50:51 +00:00 |
| async_llm_engine.py | [Core][Performance] Add XGrammar support for guided decoding and set it as default (#10785) | 2024-12-03 15:17:00 +08:00 |
| async_timeout.py | [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654) | 2024-06-19 13:57:12 -07:00 |
| llm_engine.py | monitor metrics of tokens per step using cudagraph batchsizes (#11031) | 2024-12-09 22:35:36 -08:00 |
| metrics_types.py | monitor metrics of tokens per step using cudagraph batchsizes (#11031) | 2024-12-09 22:35:36 -08:00 |
| metrics.py | monitor metrics of tokens per step using cudagraph batchsizes (#11031) | 2024-12-09 22:35:36 -08:00 |
| protocol.py | [Misc] Rename embedding classes to pooling (#10801) | 2024-12-01 14:36:51 +08:00 |
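
For orientation, here is a minimal sketch of how two of these modules are typically driven together: arg_utils.py supplies EngineArgs for configuration, and llm_engine.py provides the synchronous v0 LLMEngine that schedules and steps requests. This assumes a vLLM build contemporary with this snapshot (late 2024); the model name and prompt are arbitrary examples, not taken from this listing.

```python
# Minimal v0 usage sketch (assumed idiomatic for late-2024 vLLM);
# not the only way to drive the engine.
from vllm import SamplingParams
from vllm.engine.arg_utils import EngineArgs
from vllm.engine.llm_engine import LLMEngine

# arg_utils.py: EngineArgs bundles model and engine configuration.
engine = LLMEngine.from_engine_args(EngineArgs(model="facebook/opt-125m"))

# Request ids are caller-supplied strings; #11036 above backported
# request id validation to this v0 path.
engine.add_request(
    request_id="req-0",
    prompt="Hello, my name is",
    params=SamplingParams(max_tokens=16),
)

# llm_engine.py: each step() runs one scheduling/execution iteration
# and returns outputs for requests that made progress.
while engine.has_unfinished_requests():
    for output in engine.step():
        if output.finished:
            print(output.outputs[0].text)
```

The async variant in async_llm_engine.py wraps this same engine behind an asyncio interface for concurrent serving, with async_timeout.py supporting its timeout handling (see #5654 above).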