vllm/kernels at 6d98843b31fb6d12fa682fecf584a5b7a4e98491 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-21 20:57:28 +08:00

Chih-Chieh Yang b690e34824

[Model] Mamba2 preallocate SSM output tensor to avoid d2d copy overhead (#21075)

Signed-off-by: Chih-Chieh Yang <7364402+cyang49@users.noreply.github.com>
Signed-off-by: Chih-Chieh-Yang <7364402+cyang49@users.noreply.github.com>

2025-08-02 01:59:34 -07:00

attention

[Bugfix] Fix workspace buffer None issue for Flashinfer TRTLLM Backend (#21525)

2025-07-29 10:34:00 -04:00

core

[perf] Add fused MLA QKV + strided layernorm (#21116)

2025-07-22 07:07:44 -07:00

mamba

[Model] Mamba2 preallocate SSM output tensor to avoid d2d copy overhead (#21075)

2025-08-02 01:59:34 -07:00

moe

[Test] Add Unit Test for Batched DeepGEMM (#21559)

2025-08-02 10:45:46 +08:00

quantization

[CI] Initial tests for SM100 Blackwell runner (#21877)

2025-08-01 16:18:38 -07:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

allclose_default.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

quant_utils.py

[Refactor] Remove Duplicate per_block_cast_to_fp8, Remove Dependencies of DeepGEMM (#21787)

2025-08-01 01:13:27 +00:00

test_apply_repetition_penalties.py

[BUG] Fix #20484. Support empty sequence in cuda penalty kernel (#20491)

2025-07-05 19:38:02 -07:00

test_cutlass_mla_decode.py

[NVIDIA] Add Cutlass MLA backend (#17625)

2025-06-03 21:40:26 -07:00

test_flex_attention.py

[Misc] Add SPDX-FileCopyrightText (#20428)

2025-07-04 07:40:42 +00:00

test_fused_quant_activation.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

test_shuffle_rows.py

[Bugfix] Fix CUDA arch flags for MoE permute (#21426)

2025-07-24 03:23:59 -07:00

test_triton_flash_attention.py

[Misc] Add SPDX-FileCopyrightText (#19100)

2025-06-03 11:20:17 -07:00

utils.py

[Misc] Add unit tests for MoE ModularKernel combinations + Profiling utility (#20449)

2025-07-11 07:51:46 -07:00