xinyun / vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-01-05 09:37:29 +08:00)
vllm / vllm / distributed / device_communicators

History
Latest commit: Lain  1fb632fdb6  [Perf] Improve fp8 quant in mla; replace ReduceSum with ReduceScatterSum (#29795)
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
2025-12-08 15:02:34 -08:00
..
__init__.py                   …
all2all.py                    …
all_reduce_utils.py           …
base_device_communicator.py   …
cpu_communicator.py           …
cuda_communicator.py          [Perf] Improve fp8 quant in mla; replace ReduceSum with ReduceScatterSum (#29795)    2025-12-08 15:02:34 -08:00
cuda_wrapper.py               [Core][AMD] Migrate fully transparent sleep mode to ROCm platform (#12695)           2025-11-12 15:24:12 -08:00
custom_all_reduce.py          …
mnnvl_compat.py               …
pynccl_allocator.py           [Doc]: fixing typos in various files (#29717)                                        2025-11-29 01:15:39 -08:00
pynccl_wrapper.py             …
pynccl.py                     …
quick_all_reduce.py           …
ray_communicator.py           …
shm_broadcast.py              [MP executor] fix get device count for multi node of mp executor feature (#30042)    2025-12-09 01:33:48 +08:00
shm_object_storage.py         [Bugfix] Missing cached item in the MultiModalReceiverCache (#28525)                 2025-12-01 10:18:07 -08:00
symm_mem.py                   Revert "[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)" (#29483)                       2025-11-26 22:27:26 +08:00
tpu_communicator.py           [TPU] add tpu_inference (#27277)                                                     2025-11-26 14:46:36 -08:00
xpu_communicator.py           …