vllm/model_executor at 541a2ef892720489f770569417bc1bc4436dbb21 - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-14 00:07:22 +08:00

Wentao Ye 541a2ef892

[Perf] Deepgemm fused layout kernel for activations, 4.3% throughput improvement, 10.7% TTFT improvement. (#29546)

Signed-off-by: yewentao256 <zhyanwentao@126.com>

2025-12-07 20:31:14 +08:00

[Perf] Deepgemm fused layout kernel for activations, 4.3% throughput improvement, 10.7% TTFT improvement. (#29546)

2025-12-07 20:31:14 +08:00

[Bugfix][Quantization] Support BF16 tensors on GGUF (#29948)

2025-12-03 10:33:46 +00:00

Revert "[Renderer] Separate out RendererConfig from ModelConfig (#30145)" (#30199)

2025-12-07 00:00:22 -08:00

[Core] Encoder separation for Encode-Prefill-Decode Disaggregation (#25233)

2025-11-11 18:58:33 -08:00

__init__.py

…

custom_op.py

…

parameter.py

…

utils.py

[CI] Fix mypy for vllm/v1/worker (#29037)

2025-11-21 11:36:07 +08:00