vllm/layers at e04492449eeb1ca945ce08b2740ea75bddd0c8a9 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2026-01-25 16:34:33 +08:00

[Feature] Extend batch invariant torch.compile to B200 (#27856)

Signed-off-by: PaulZhang12 <paulzhan@fb.com>

2025-11-05 10:04:49 -08:00

fla

[Bugfix][plugin] fla crash on plugin (#27322)

2025-11-04 05:27:03 +08:00

fused_moe

[XPU] Enable custom routing functions in IPEX for Llama4 (#28004)

2025-11-05 13:39:57 +00:00

mamba

[V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377)

2025-11-02 04:16:23 -08:00

quantization

Bugfix: Cutlass FP8 FusedMoE bad scaling factors (#27255)

2025-11-05 06:06:06 -05:00

rotary_embedding

[Bugfix][ROCm] Fix ViT rotary embeddings for torch.compile compatibility on ROCm (#27748)

2025-11-03 17:12:19 -08:00

__init__.py

…

activation.py

…

attention_layer_base.py

…

batch_invariant.py

[Feature] Extend batch invariant torch.compile to B200 (#27856)

2025-11-05 10:04:49 -08:00

kda.py

[Bugfix] Fix KDA output (#27905)

2025-11-01 11:54:36 +08:00

layernorm.py

Revert "[PERF] Decouple projections from GDN custom op" (#28080)

2025-11-04 15:58:23 -08:00

lightning_attn.py

…

linear.py

…

logits_processor.py

…

mla.py

[Model] Introduce Kimi Linear to vLLM (#27809)

2025-10-30 21:02:27 +08:00

pooler.py

[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)

2025-10-30 12:13:05 +00:00

resampler.py

…

utils.py

[ROCm] gemm_a16w16 upstreaming (#26969)

2025-11-04 16:01:00 -05:00

vocab_parallel_embedding.py

…