vllm/layers at 1a0b157a2ea46eebd69072f78e5a97ece4f6a2e7 - vllm

mirror of https://git.datalinker.icu/vllm-project/vllm.git synced 2025-12-21 08:45:01 +08:00

[platform] Move get_cu_count to utils (#27005)

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

2025-11-13 08:48:47 +08:00

fla

[Bugfix][plugin] fla crash on plugin (#27322)

2025-11-04 05:27:03 +08:00

fused_moe

[MoE][Kernel][Perf] Improve Shared Expert Stream Overlap (#28406)

2025-11-12 23:37:24 +00:00

mamba

[V1] [Hybrid] Mamba1 Automatic Prefix Caching (#26377)

2025-11-02 04:16:23 -08:00

quantization

[platform] Move get_cu_count to utils (#27005)

2025-11-13 08:48:47 +08:00

rotary_embedding

[Kernel][Perf] fuse QK Norm and RoPE into one cuda kernel for Qwen Model (#27165)

2025-11-11 12:00:31 -05:00

__init__.py

…

activation.py

…

attention_layer_base.py

…

batch_invariant.py

[Core] Cache vllm_is_batch_invariant (#28304)

2025-11-12 05:03:01 +00:00

kda.py

[Bugfix] Fix KDA output (#27905)

2025-11-01 11:54:36 +08:00

layernorm.py

[RFC][ROCm][AITER] Keep all AITER kernels in _aiter_ops class like _custom_ops and _ipex_ops (#24490)

2025-11-10 08:20:53 -08:00

lightning_attn.py

…

linear.py

…

logits_processor.py

…

mla.py

[Model] Introduce Kimi Linear to vLLM (#27809)

2025-10-30 21:02:27 +08:00

pooler.py

[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. (#25524)

2025-10-30 12:13:05 +00:00

resampler.py

…

utils.py

[platform] Move get_cu_count to utils (#27005)

2025-11-13 08:48:47 +08:00

vocab_parallel_embedding.py

…