xinyun/vllm
mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-24 03:45:31 +08:00
vllm/vllm/model_executor/layers
Latest commit: 0b25498990 by haoyangli-amd | [Misc] add ignore mapper for quark quantization (#28275) | 2025-11-14 05:56:35 +00:00
Signed-off-by: Haoyang Li <lihaoyang0109@gmail.com>
fla/                          | [Bugfix][plugin] fla crash on plugin (#27322)                                                          | 2025-11-04 05:27:03 +08:00
fused_moe/                    | [Performance][B200] silu_mul_quant: pack scales in int32 (#28358)                                      | 2025-11-13 10:16:55 -08:00
mamba/                        | [V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377)                                                 | 2025-11-02 04:16:23 -08:00
quantization/                 | [Misc] add ignore mapper for quark quantization (#28275)                                               | 2025-11-14 05:56:35 +00:00
rotary_embedding/             | [Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165)                      | 2025-11-11 12:00:31 -05:00
__init__.py                   | …
activation.py                 | …
attention_layer_base.py       | …
batch_invariant.py            | [Core] Cache vllm_is_batch_invariant (#28304)                                                          | 2025-11-12 05:03:01 +00:00
kda.py                        | [Bugfix] fix kimi-linear crash (#28445)                                                                | 2025-11-13 07:59:58 +00:00
layernorm.py                  | [RFC][ROCm][AITER] Keep all AITER kernels in _aiter_ops class like _custom_ops and _ipex_ops (#24490)  | 2025-11-10 08:20:53 -08:00
lightning_attn.py             | …
linear.py                     | …
logits_processor.py           | …
mla.py                        | [Model] Introduce Kimi Linear to vLLM (#27809)                                                         | 2025-10-30 21:02:27 +08:00
pooler.py                     | [Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)       | 2025-10-30 12:13:05 +00:00
resampler.py                  | …
utils.py                      | [platform] Move get_cu_count to utils (#27005)                                                         | 2025-11-13 08:48:47 +08:00
vocab_parallel_embedding.py   | …