xinyun / vllm
Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-01-05 09:37:29 +08:00)
vllm / vllm / distributed / device_communicators

History
Latest commit: Lain  1fb632fdb6  [Perf] Improve fp8 quant in mla; replace ReduceSum with ReduceScatterSum (#29795)
Signed-off-by: Siyuan Fu <siyuanf@nvidia.com>
2025-12-08 15:02:34 -08:00
..
__init__.py                   …
all2all.py                    …
all_reduce_utils.py           …
base_device_communicator.py   …
cpu_communicator.py           …
cuda_communicator.py          [Perf] Improve fp8 quant in mla; replace ReduceSum with ReduceScatterSum (#29795)    2025-12-08 15:02:34 -08:00
cuda_wrapper.py               [Core][AMD] Migrate fully transparent sleep mode to ROCm platform (#12695)           2025-11-12 15:24:12 -08:00
custom_all_reduce.py          …
mnnvl_compat.py               …
pynccl_allocator.py           [Doc]: fixing typos in various files (#29717)                                        2025-11-29 01:15:39 -08:00
pynccl_wrapper.py             …
pynccl.py                     …
quick_all_reduce.py           …
ray_communicator.py           …
shm_broadcast.py              [MP executor] fix get device count for multi node of mp executor feature (#30042)    2025-12-09 01:33:48 +08:00
shm_object_storage.py         [Bugfix] Missing cached item in the MultiModalReceiverCache (#28525)                 2025-12-01 10:18:07 -08:00
symm_mem.py                   Revert "[Bugfix] Fix GPT-OSS AR+NORM fusion (#28841)" (#29483)                       2025-11-26 22:27:26 +08:00
tpu_communicator.py           [TPU] add tpu_inference (#27277)                                                     2025-11-26 14:46:36 -08:00
xpu_communicator.py           …