vllm/kernels at 63ced7b43f56c4f81b73b0ad176e820f70b2e782 - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-06-24 11:27:15 +08:00

Jinzhen Lin 1d0c9d6b2d

[Kernel] some optimizations for dense marlin and moe marlin (#16850)

Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>

2025-05-05 09:39:30 -07:00

Update test_flash_attn.py (#17102)

2025-04-26 22:17:35 +00:00

Categorize tests/kernels/ based on kernel type (#16799)

2025-04-23 09:21:07 -04:00

Categorize tests/kernels/ based on kernel type (#16799)

2025-04-23 09:21:07 -04:00

[Kernel] some optimizations for dense marlin and moe marlin (#16850)

2025-05-05 09:39:30 -07:00

[Kernel] some optimizations for dense marlin and moe marlin (#16850)

2025-05-05 09:39:30 -07:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

allclose_default.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628)

2025-02-02 11:58:18 -08:00

quant_utils.py

Add missing rocm_skinny_gemms kernel test to CI (#17060)

2025-04-24 07:49:37 -07:00

test_cutlass_mla_decode.py

[NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032)

2025-04-27 06:29:21 -07:00

test_fused_quant_activation.py

[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations (#10867)

2025-05-01 07:59:28 -07:00

test_triton_flash_attention.py

[Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (#12591)

2025-04-27 00:35:08 +00:00

utils.py

[Misc] Replace os environ to monkeypatch in test suite (#14516)

2025-03-16 20:35:57 -07:00