vllm/engine at 5b2dcbf0b8dd9ee9199d7496c84e84c010122a00 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-25 10:17:15 +08:00

[FEAT][ROCm]: Support AITER MLA on V1 Engine (#17523)

Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Co-authored-by: qli88 <qiang.li2@amd.com>
Co-authored-by: Hongxia Yang <62075498+hongxiayang@users.noreply.github.com>

2025-05-09 10:42:05 +08:00

multiprocessing

Improve exception reporting in MP engine (#17800)

2025-05-08 05:32:39 +00:00

output_processor

Add full API docs and improve the UX of navigating them (#17485)

2025-05-03 19:42:43 -07:00

__init__.py

Change the name to vLLM (#150)

2023-06-17 03:07:40 -07:00

arg_utils.py

[FEAT][ROCm]: Support AITER MLA on V1 Engine (#17523)

2025-05-09 10:42:05 +08:00

async_llm_engine.py

Add full API docs and improve the UX of navigating them (#17485)

2025-05-03 19:42:43 -07:00

async_timeout.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00

llm_engine.py

Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)

2025-05-07 00:07:30 -07:00

metrics_types.py

[V1][Metrics] Support vllm:cache_config_info (#13299)

2025-02-22 00:20:00 -08:00

metrics.py

[Metrics] Fix minor inconsistencies in bucket progression (#17262)

2025-04-27 16:19:39 +00:00

protocol.py

[Misc] Clean up input processing (#17582)

2025-05-02 08:11:53 -07:00