Mirror of https://git.datalinker.icu/vllm-project/vllm.git (synced 2026-03-31 13:07:05 +08:00)
vllm / tests / kernels
Latest commit: 562308816c, Wentao Ye, 2025-06-26 22:19:32 +00:00
    [Refactor] Rename communication utils (#20091)
    Signed-off-by: yewentao256 <zhyanwentao@126.com>
attention/
    [Refactor] Remove duplicate ceil_div (#20023), 2025-06-25 05:19:09 +00:00
core/
    [CI] change spell checker from codespell to typos (#18711), 2025-06-11 19:57:10 -07:00
mamba/
    [CI] change spell checker from codespell to typos (#18711), 2025-06-11 19:57:10 -07:00
moe/
    [Refactor] Rename communication utils (#20091), 2025-06-26 22:19:32 +00:00
quantization/
    [Kernels][Bugfix] Use torch op for all kernels in FusedMoE forward. Add additional testing for cudagraphs. (#19717), 2025-06-24 23:22:58 -07:00
__init__.py
    …
allclose_default.py
    …
quant_utils.py
    …
test_apply_repetition_penalties.py
    …
test_cutlass_mla_decode.py
    [NVIDIA] Add Cutlass MLA backend (#17625), 2025-06-03 21:40:26 -07:00
test_flex_attention.py
    Fixes IMA for TP w/ flex-attention (#19712), 2025-06-17 04:01:50 +00:00
test_fused_quant_activation.py
    …
test_triton_flash_attention.py
    …
utils.py
    [Kernels][Bugfix] Use torch op for all kernels in FusedMoE forward. Add additional testing for cudagraphs. (#19717), 2025-06-24 23:22:58 -07:00
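Notes on three of the entries above follow.

On the attention entry: ceil_div, the helper de-duplicated by #20023, is integer ceiling division, the standard way kernel code computes how many fixed-size blocks cover a dimension. A minimal sketch of the usual definition (the name comes from the commit title; the exact vLLM signature and import path are not shown here, so treat them as assumptions):

    def ceil_div(a: int, b: int) -> int:
        # Smallest integer >= a / b, assuming b > 0.
        # Equivalent to (a + b - 1) // b, written to avoid the a + b overflow
        # that the additive form can hit in fixed-width ports.
        return -(a // -b)

    assert ceil_div(10, 4) == 3  # 10 / 4 = 2.5 rounds up to 3
    assert ceil_div(8, 4) == 2   # exact division is unchanged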
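On the quantization and utils.py entries: per its title, #19717 routes the FusedMoE kernels through torch ops, which keeps custom kernels visible to torch.compile and CUDA-graph capture instead of leaving them as opaque Python calls. A hedged sketch of that general pattern using torch.library.custom_op (available in PyTorch 2.4+; the demo namespace, op name, and body are illustrative stand-ins, not vLLM's actual registrations):

    import torch

    # Hypothetical op for illustration; a real registration would launch a CUDA kernel.
    @torch.library.custom_op("demo::scaled_add", mutates_args=())
    def scaled_add(x: torch.Tensor, y: torch.Tensor, scale: float) -> torch.Tensor:
        return x + scale * y

    @scaled_add.register_fake
    def _(x: torch.Tensor, y: torch.Tensor, scale: float) -> torch.Tensor:
        # Shape/dtype propagation so tracing and graph capture can plan outputs
        # without executing the kernel.
        return torch.empty_like(x)

    out = scaled_add(torch.ones(4), torch.ones(4), 2.0)  # tensor([3., 3., 3., 3.])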
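On test_flex_attention.py: FlexAttention is PyTorch's programmable attention API, in which a score_mod callback rewrites attention scores inside the fused kernel, and the title of #19712 indicates it fixes an illegal memory access (IMA) when that path runs under tensor parallelism (TP). A minimal eager-mode call with assumed shapes (PyTorch 2.5+, CUDA device; the causal mask is the stock documentation example, not the test's actual workload):

    import torch
    from torch.nn.attention.flex_attention import flex_attention

    # (batch, heads, seq_len, head_dim) picked arbitrarily for the sketch.
    q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
    k, v = torch.randn_like(q), torch.randn_like(q)

    def causal(score, b, h, q_idx, kv_idx):
        # Push scores for future key positions to -inf (a causal mask).
        return torch.where(q_idx >= kv_idx, score, float("-inf"))

    out = flex_attention(q, k, v, score_mod=causal)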