vllm/moe at b9590323e284b13fe9c2a9e69f7cfb5b483f089e - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-07-04 22:07:19 +08:00


Signed-off-by: Xinyu Chen <xinyu1.chen@intel.com>

2025-12-17 17:43:00 +08:00

modular_kernel_tools

[Feat] Refactor for parallel_config in FusedMoEModularKernel (#30282)

2025-12-15 04:21:36 +00:00

__init__.py

[Kernel] DeepEP dispatch-combine kernel integration (#18434)

2025-06-03 12:30:02 -07:00

parallel_utils.py

[Chore] Separate out optional dependency checks from vllm.utils (#27207)

2025-10-22 10:44:21 -04:00

test_batched_deepgemm.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_batched_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_block_fp8.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_block_int8.py

[Misc] Make SchedulerConfig.max_model_len init-only (#28733)

2025-11-15 01:59:31 -08:00

test_count_expert_num_tokens.py

Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)

2025-10-12 09:51:31 -07:00

test_cutedsl_moe.py

[MoE] Nvfp4 Masked Gemm: Add flashinfer grouped_gemm_nt_masked (#25990)

2025-11-19 13:29:06 -08:00

test_cutlass_grouped_gemm.py

[Chore]:Extract math and argparse utilities to separate modules (#27188)

2025-10-26 04:03:32 -07:00

test_cutlass_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_deepep_deepgemm_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_deepep_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_deepgemm.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_flashinfer_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_flashinfer.py

[Feat] Refactor for parallel_config in FusedMoEModularKernel (#30282)

2025-12-15 04:21:36 +00:00

test_gpt_oss_triton_kernels.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_grouped_topk.py

CustomOp: grouped topk (#29575)

2025-12-17 17:43:00 +08:00

test_modular_kernel_combinations.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_modular_oai_triton_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_moe_align_block_size.py

[Kernel][MoE] optimize moe_align_block_size (#29642)

2025-12-07 01:58:47 -08:00

test_moe_permute_unpermute.py

[CI/Build] Only use supported types and features on ROCm in MoE kernel tests (#29149)

2025-11-21 20:34:33 -07:00

test_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_nvfp4_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_ocp_mx_moe.py

[Feature] Add SM103 (Blackwell Ultra) Support to vLLM (#30484)

2025-12-12 19:34:23 -08:00

test_pplx_cutlass_moe.py

[Misc] Make SchedulerConfig.max_model_len init-only (#28733)

2025-11-15 01:59:31 -08:00

test_pplx_moe.py

[Attention] Use sparse prefill kernel for fp8 kv-cache in DeepSeek-v3.2 (#27532)

2025-12-12 05:57:47 -08:00

test_rocm_aiter_topk.py

Convert formatting to use ruff instead of yapf + isort (#26247)

2025-10-05 07:06:22 -07:00

test_silu_mul_fp8_quant_deep_gemm.py

[CI/Build] Only use supported types and features on ROCm in MoE kernel tests (#29149)

2025-11-21 20:34:33 -07:00

test_silu_mul_per_token_group_quant_fp8_colmajor.py

[Performance][DP/EP] Add silu_mul_per_token_group_quant_fp8_colmajor kernel (#29470)

2025-12-03 18:04:59 +00:00

test_triton_moe_ptpc_fp8.py

[CI/Build] Only use supported types and features on ROCm in MoE kernel tests (#29149)

2025-11-21 20:34:33 -07:00

utils.py

[Feat] Support non-gated activations in NVFP4 modelopt path (#29004)

2025-11-30 11:02:40 -05:00