vllm/distributed at f5f51e5931ffd99afe69696b60765b88d3eb13f2 - vllm - 丝路新云-代码仓

xinyun/vllm

Mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-07-02 11:07:10 +08:00

Roger Wang f5f51e5931

[Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475)

Signed-off-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Sun Kim <sunytokki@gmail.com>

2025-12-16 14:18:17 -08:00

device_communicators

[Perf] Do FP4 quant before All gather on flashinfer trtllmgen MOE (#30014)

2025-12-16 13:01:48 -08:00

[Core][MM] Optimize encoder cache manager by operating with embeddings only (#30475)

2025-12-16 14:18:17 -08:00

[ROCm][CI] Add "Qwen3-Next-80B-A3B-Instruct MTP Async EPLB Accuracy Test" Back Into AMD CI (#30590)

2025-12-14 06:56:26 +00:00

[NIXL][BUG FIX] Fix a bug for PD with host_buffer after merging 29665 (#30420)

2025-12-14 15:38:28 +00:00

__init__.py

…

communication_op.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

kv_events.py

[KVConnector] Add KV events to KV Connectors (#28309)

2025-12-11 15:30:29 +01:00

parallel_state.py

[Perf] Do FP4 quant before All gather on flashinfer trtllmgen MOE (#30014)

2025-12-16 13:01:48 -08:00

tpu_distributed_utils.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

utils.py

[UX] Suppress gloo log spam (#29250)

2025-11-25 17:19:35 -08:00