vllm/platforms at 3d2779c29a9f5003f6fec6ca07205147e2c987d1 - vllm - 丝路新云 code repository

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-08 12:22:18 +08:00

bnellnm f9c069c85e

Modularize fused experts and integrate PPLX kernels (#15956)

2025-05-14 13:11:54 -07:00

__init__.py

Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)

2025-05-07 00:07:30 -07:00

cpu.py

[Misc] Auto fallback to float16 for pre-Ampere GPUs when detected bfloat16 config (#17265)

2025-05-09 17:16:12 +00:00

cuda.py

Modularize fused experts and integrate PPLX kernels (#15956)

2025-05-14 13:11:54 -07:00

hpu.py

[Hardware][Intel-Gaudi] Multi-step scheduling implementation for HPU (#12779)

2025-04-11 07:38:36 -07:00

interface.py

Update deprecated type hinting in platform, plugins, triton_utils, vllm_flash_attn (#18129)

2025-05-14 05:28:16 -07:00

neuron.py

Add NeuronxDistributedInference support, Speculative Decoding, Dynamic on-device sampling (#16357)

2025-05-07 00:07:30 -07:00

rocm.py

Update deprecated type hinting in platform, plugins, triton_utils, vllm_flash_attn (#18129)

2025-05-14 05:28:16 -07:00

tpu.py

Update deprecated type hinting in platform, plugins, triton_utils, vllm_flash_attn (#18129)

2025-05-14 05:28:16 -07:00

xpu.py

[Hardware] add platform-specific request validation api (#16291)

2025-04-09 12:50:01 -07:00