xinyun / vllm (mirror of https://git.datalinker.icu/vllm-project/vllm.git, synced 2026-04-12 16:37:07 +08:00)
vllm / tests / kernels
Latest commit: bbe888d033 "wip" by Bill Nell (Signed-off-by: Bill Nell <bnell@redhat.com>), 2025-05-28 23:40:27 +00:00
attention | [Bugfix][ROCm] fix the power of 2 exception from triton_unified_attention.py when running llama4 models and unit test fix (#18100) | 2025-05-29 07:21:46 +08:00
core | [Bugfix] fix rotary embedding test for _get_padded_tensor_shape (#18229) | 2025-05-16 01:32:45 +00:00
mamba | [Model] Mamba2 causal conv1d Refactor to Split Prefill and Decode Requests for Corresponding Kernels (#17146) | 2025-05-06 17:59:30 -07:00
moe | wip | 2025-05-28 23:40:27 +00:00
quantization | [V1][Quantization] Add CUDA graph compatible v1 GGUF support (#18646) | 2025-05-27 04:40:28 +00:00
__init__.py | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00
allclose_default.py | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00
quant_utils.py | Add missing rocm_skinny_gemms kernel test to CI (#17060) | 2025-04-24 07:49:37 -07:00
test_cutlass_mla_decode.py | [NVIDIA] Support Cutlass MLA for Blackwell GPUs (#16032) | 2025-04-27 06:29:21 -07:00
test_fused_quant_activation.py | [AMD][torch.compile] Enable silu+fp8_quant fusion for rocm (#18082) | 2025-05-13 22:13:56 -07:00
test_triton_flash_attention.py | [Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (#12591) | 2025-04-27 00:35:08 +00:00
utils.py | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00