vllm/kernels at e019391cd85bc474d38b4a590a3bbe297cdcddeb - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-19 01:47:14 +08:00

Jhao-Ting Chen e019391cd8 refine commit, polish PR

Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>

2025-12-24 11:19:40 -08:00

attention

[Misc] Remove unused custom ops copy_blocks and copy_blocks_mla (#30967)

2025-12-23 18:22:35 -08:00

core

[Kernel] Enable fused_qknorm_rope_kernel supports partial rope (#30821)

2025-12-21 18:39:22 -08:00

mamba

Add SpecDec support to selective_state_update (#29488)

2025-12-08 16:45:18 -05:00

moe

[MoE Refactor][9/N] Use modular kernel for unquantized Triton MoE (#31052)

2025-12-22 17:34:19 +00:00

quantization

refine commit, polish PR

2025-12-24 11:19:40 -08:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

allclose_default.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

quant_utils.py

[CI/Build][AMD] Fix ref_dynamic_per_token_quant reference implementation on ROCm. (#30291)

2025-12-12 09:30:23 +00:00

test_apply_repetition_penalties.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_cache_kernels.py

[Bugfix][cache_kernels]: Fix OOB in cache_kernels.cu (#28760)

2025-11-20 02:52:02 -08:00

test_fla_layernorm_guard.py

[PERF] [Qwen3-next] Speed up gated RMSNorm (#26207)

2025-10-12 08:27:50 +00:00

test_flex_attention.py

[Fix][FlexAttention] return max logical block index to handle reused blocks (#30915)

2025-12-18 06:42:21 +00:00

test_fused_quant_activation.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_onednn.py

[CPU] Refactor CPU attention backend (#27954)

2025-11-12 09:43:06 +08:00

test_shuffle_rows.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_top_k_per_row.py

[DeepSeek v3.2] Make top-k work for any logit values. (#27568)

2025-12-08 06:55:58 -08:00

utils.py

[Feat] Support non-gated activations in NVFP4 modelopt path (#29004)

2025-11-30 11:02:40 -05:00