vllm/platforms at e7e3e6d2636f6cd012c7ffeff773b20b3c90b958 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-26 22:27:30 +08:00

Alexander Matveev 8cdc371217

SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)

Signed-off-by: Alexander Matveev <amatveev@redhat.com>

2025-07-15 01:06:38 +00:00

__init__.py

[Refactor]Abstract Platform Interface for Distributed Backend and Add xccl Support for Intel XPU (#19410)

2025-07-07 04:32:32 +00:00

cpu.py

[Doc] Add engine args back in to the docs (#20674)

2025-07-10 08:02:40 -07:00

cuda.py

SM100 Cutlass MLA decode with unrestricted num_heads (< 128) for DeepSeek TP (#20769)

2025-07-15 01:06:38 +00:00

hpu.py

[misc]refactor Platform.set_device method (#20262)

2025-07-09 01:39:47 +00:00

interface.py

[Model] New model support for microsoft/Phi-4-mini-flash-reasoning (#20702)

2025-07-12 06:02:10 -07:00

neuron.py

[Refactor]Abstract Platform Interface for Distributed Backend and Add xccl Support for Intel XPU (#19410)

2025-07-07 04:32:32 +00:00

rocm.py

[ROCm][Regression] Remove tensor creation that harms performance on ROCm (#20741)

2025-07-10 09:22:23 -07:00

tpu.py

[BugFix] Fix VllmConfig() construction on all platforms (#20695)

2025-07-10 07:00:20 +00:00

xpu.py

[BugFix] Fix VllmConfig() construction on all platforms (#20695)

2025-07-10 07:00:20 +00:00