vllm/device_communicators at f5f51e5931ffd99afe69696b60765b88d3eb13f2 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-02 14:17:16 +08:00

[Perf] Do FP4 quant before All gather on flashinfer trtllmgen MOE (#30014)

Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>

2025-12-16 13:01:48 -08:00

__init__.py

…

all2all.py

[Perf] Do FP4 quant before All gather on flashinfer trtllmgen MOE (#30014)

2025-12-16 13:01:48 -08:00

all_reduce_utils.py

[Chore] Separate out system utilities from vllm.utils (#27201)

2025-10-22 20:25:25 +00:00

base_device_communicator.py

[Perf] Do FP4 quant before All gather on flashinfer trtllmgen MOE (#30014)

2025-12-16 13:01:48 -08:00

cpu_communicator.py

…

cuda_communicator.py

[Perf] Do FP4 quant before All gather on flashinfer trtllmgen MOE (#30014)

2025-12-16 13:01:48 -08:00

cuda_wrapper.py

[Core][AMD] Migrate fully transparent sleep mode to ROCm platform (#12695)

2025-11-12 15:24:12 -08:00

custom_all_reduce.py

[Log] Optimize Startup Log (#26740)

2025-10-24 19:27:04 -04:00

mnnvl_compat.py

…

pynccl_allocator.py

[Doc]: fixing typos in various files. (#29717)

2025-11-29 01:15:39 -08:00

pynccl_wrapper.py

[Chore] Separate out NCCL utilities from vllm.utils (#27197)

2025-10-21 06:18:23 -07:00

pynccl.py

[Log] Optimize Startup Log (#26740)

2025-10-24 19:27:04 -04:00

quick_all_reduce.py

[Chore] Clean up pytorch helper functions in vllm.utils (#26908)

2025-10-18 09:48:22 -07:00

ray_communicator.py

[Misc] Avoid "PyTorch non-writable tensors" warning in RayPPCommunicator (#27443)

2025-10-24 14:53:09 +08:00

shm_broadcast.py

fix(shm): Add memory barriers for cross-process shared memory visibility (#30407)

2025-12-10 23:01:19 +00:00

shm_object_storage.py

[Bugfix] Missing cached item in the MultiModalReceiverCache (#28525)

2025-12-01 10:18:07 -08:00

symm_mem.py

Revert "[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)" (#29483)

2025-11-26 22:27:26 +08:00

tpu_communicator.py

[TPU] add tpu_inference (#27277)

2025-11-26 14:46:36 -08:00

xpu_communicator.py

[UX] Replace VLLM_ALL2ALL_BACKEND with --all2all-backend (#26732)

2025-10-13 18:12:52 -07:00