vllm/kernels at f8b19c0ffd65f7f6f01a0da4a39b6890f5db40cb - vllm - 丝路新云-代码仓

xinyun/vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-05-05 23:39:12 +08:00

amirkl94 03ee48111d

Feature: Support Relu2 in FusedMoE fp8 cutlass path (#27261)

2025-11-16 13:39:44 -05:00

[BugFix] Fix FA3 IMA with FULL_AND_PIECEWISE and cascade attention (default) (#28702)

2025-11-14 12:19:22 +00:00

[ROCm] [Bugfix] Fix fused_qknorm_rope_kernel rocm compatibility (#28500)

2025-11-12 05:01:14 -08:00

[V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377)

2025-11-02 04:16:23 -08:00

Feature: Support Relu2 in FusedMoE fp8 cutlass path (#27261)

2025-11-16 13:39:44 -05:00

[Misc] Make SchedulerConfig.max_model_len init-only (#28733)

2025-11-15 01:59:31 -08:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

allclose_default.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

quant_utils.py

[Chore]: Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_apply_repetition_penalties.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_fla_layernorm_guard.py

[PERF] [Qwen3-next] Speed up gated RMSNorm (#26207)

2025-10-12 08:27:50 +00:00

test_flex_attention.py

[V0 Deprecation] Remove VLLM_USE_V1 from tests (#26341)

2025-10-07 15:42:31 +00:00

test_fused_quant_activation.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_onednn.py

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

test_shuffle_rows.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_top_k_per_row.py

[Deepseek v3.2] Remove extra logics in indexer (#26465)

2025-10-21 23:34:03 +00:00

utils.py

[V0 deprecation] Remove no longer used get_metadata_cls (#28370)

2025-11-10 14:32:09 +08:00