vllm/tests/kernels
Latest commit: f66673a39d
[Kernel] Added flashinfer fp8 per-tensor gemms (#22895)
Signed-off-by: Julien Lin <jullin@nvidia.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-08-26 06:54:04 -07:00
Name | Last commit | Last updated
attention/ | [NVIDIA][torch.compile] Support Flashinfer TRTLLM FP8-q/kv NVFP4-out Attention Kernel (#22703) | 2025-08-22 22:09:05 +00:00
core/ | [Kernel] Add cuda kernel for gpt_oss activation (#22951) | 2025-08-17 05:03:24 +00:00
mamba/ | [Mamba] - refactor: Renamed mamba_attn to mamba2_attn (#22818) | 2025-08-15 06:38:05 +00:00
moe/ | [CI Fix] Pin deepep and pplx tags in tools/ep_kernels/, gate multigpu tests (#23568) | 2025-08-25 18:29:00 -07:00
quantization/ | [Kernel] Added flashinfer fp8 per-tensor gemms (#22895) | 2025-08-26 06:54:04 -07:00
__init__.py | … | …
allclose_default.py | … | …
quant_utils.py | … | …
test_apply_repetition_penalties.py | … | …
test_cutlass_mla_decode.py | … | …
test_flex_attention.py | Updates to Flex + VLLm integration (#21416) | 2025-08-25 09:32:42 -04:00
test_fused_quant_activation.py | … | …
test_onednn.py | [CPU] Refactor CPU W8A8 scaled_mm (#23071) | 2025-08-21 09:34:24 +08:00
test_shuffle_rows.py | … | …
test_triton_flash_attention.py | … | …
utils.py | [Kernel] [Quantization] Add MXFP4 and bias support for marlin kernel (#22428) | 2025-08-14 11:23:22 -07:00